New York-based Rasgo, a startup developing a platform for feature store workflows, today announced that it closed a $20 million series A funding round led by Insight Partners, with participation from Unusual Ventures. CEO Jared Parker says that the funding will be used to scale Rasgo from around 13 employees to over 50 in 2022, mostly on the engineering side, and to increase awareness of the company's solutions within the data science community.
In machine learning, features are data signals that AI models rely on to make predictions. During training, features are stored in batches so that multiple model variations can be trained, and the same features need to be available during inference for predictions. Feature stores automate data prep for this type of analytics. In a recent report, startup Tecton said it expects 2021 to be a year of "massive feature store adoption" as "machine learning becomes a key differentiator for technology companies" and incumbents like Amazon launch new products to address the growing market segment.
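To make the training-versus-inference point concrete, here is a minimal sketch of the idea, assuming a toy in-memory store. The class `ToyFeatureStore`, its method names, and the sample records are hypothetical and are not taken from Rasgo, Tecton, or any other product. It only illustrates that a feature is defined once and reused on both the batch (training) and single-record (inference) paths.

```python
from typing import Callable, Dict, List


class ToyFeatureStore:
    """Hypothetical in-memory feature store, for illustration only."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[dict], float]] = {}

    def register(self, name: str, transform: Callable[[dict], float]) -> None:
        """Register a named feature computed from one raw record."""
        self._transforms[name] = transform

    def features_for(self, record: dict, names: List[str]) -> List[float]:
        """Compute one feature vector (the path used at inference time)."""
        return [self._transforms[n](record) for n in names]

    def training_batch(self, records: List[dict], names: List[str]) -> List[List[float]]:
        """Compute feature vectors for a batch of records (the training path)."""
        return [self.features_for(r, names) for r in records]


store = ToyFeatureStore()
# Each feature is defined exactly once (made-up example features).
store.register("order_total", lambda r: r["price"] * r["quantity"])
store.register("is_weekend", lambda r: 1.0 if r["weekday"] >= 5 else 0.0)

raw = [
    {"price": 2.5, "quantity": 4, "weekday": 5},
    {"price": 10.0, "quantity": 1, "weekday": 2},
]
names = ["order_total", "is_weekend"]

batch = store.training_batch(raw, names)   # features for training
online = store.features_for(raw[0], names)  # same features for one prediction
```

Because both paths call the same registered transforms, the vector served for a single prediction matches the corresponding row of the training batch, which is the consistency a feature store is meant to provide.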
Rasgo's feature store solutions attempt to surface quality issues in raw data and execute resolutions to cleanse data and create features. The platform enables users to explore feature details including histograms, statistics, missing values, outliers, and data quality, as well as transform raw features into new, derivative features with out-of-the-box feature transforms.
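As a rough illustration of the kind of profiling described above, the sketch below counts missing values, computes basic statistics, flags outliers, and derives a new feature from a raw one. It uses only the Python standard library. The function names `profile` and `clip_to_fences`, the 1.5 x IQR outlier rule, and the sample values are assumptions chosen for the example and do not reflect how Rasgo implements these checks.

```python
import statistics
from typing import List, Optional


def profile(values: List[Optional[float]]) -> dict:
    """Summarize one raw feature column (hypothetical helper)."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed outlier fences
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "low": low,
        "high": high,
        "outliers": [v for v in present if v < low or v > high],
    }


def clip_to_fences(values: List[Optional[float]], low: float, high: float) -> List[Optional[float]]:
    """Derive a new feature by clipping values to the fences; keep gaps as None."""
    return [None if v is None else min(max(v, low), high) for v in values]


# Made-up daily sales figures with two gaps and one suspicious spike.
daily_sales = [12.0, 15.0, None, 14.0, 13.0, 120.0, 16.0, None, 11.0]

report = profile(daily_sales)
clipped = clip_to_fences(daily_sales, report["low"], report["high"])
```

Here `report` plays the role of the data quality summary and `clipped` plays the role of a derivative feature built from the raw column.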
Rasgo was founded in 2020 by Parker and Patrick Dougherty, who've been in the data science and machine learning space for most of their careers. Dougherty worked in data science at Dell and later moved into consulting at Slalom, where he led and managed a large practice of data scientists and engineers. Parker was previously the managing director of sales at Domino Data Lab and headed global financial services accounts at Confluent.
"As we engaged with data scientists, we consistently heard them scream in frustration: 'This is cool, I have this acceleration in the modeling and math layer, but I went to school, I got my [graduate degree] to solve critical problems with models. Why am I spending the vast majority of my time extracting, exploring, cleaning, joining, and transforming raw data into a set of features that can be consumed by my model?!' We knew this needed to change," Parker told VentureBeat via email.
Feature store potential
Data scientists spend the bulk of their time cleaning and organizing data, according to a 2016 survey conducted by CrowdFlower. In a recent Alation report, a majority of respondents (87%) pegged data quality issues as the reason their organizations failed to implement AI. That's perhaps why firms like Markets and Markets anticipate that the data prep industry, which includes companies that offer data cataloging and curation tools, will be worth upwards of $3.9 billion by the end of 2021.
Parker acknowledges that Rasgo has competitors in the feature store solution space, including Molecula. But he argues that the company is unique in that it was developed as a fully managed software-as-a-service offering that doesn't store or process any raw data. Moreover, he notes, Rasgo offers free tools like PyRasgo, a feature engineering experience that's seen 70,000 downloads.
Rasgo has five customers in the finance, manufacturing, biotech, retail, and alternative energy fields, two of which are Fortune 500 companies, according to Parker. One is a retailer with seven time series datasets that drive predictive models, helping its demand forecasting and routing teams make decisions. With these models, the retailer optimizes which products need to go where for "just in time" inventory, as well as which logistics routes are ideal to get goods shipped. Rasgo acts as the central feature store, enabling the data science team to build a single feature repository on all time series datasets. Data scientists use these features in every model they train and deploy, directly from Rasgo.
"The pandemic has given us the opportunity to build our company as a fully remote-first workforce. Due to this, we've been able to hire world class engineering talent across 8 different states. This has been a critical differentiator to our business early on," Parker said. "Rasgo is dedicated to accelerating adoption of the data cloud for data science and has developed an integration with Snowflake that allows their customers to unlock net new use cases and develop high quality ML features that are ready for production."