Data Quality vendor Naveego looks to Big Data with new platform release

Naveego has refocused its platform for tracking, logging and cleansing data to handle Big Data volumes, aiming to take advantage of what it sees as a growing opportunity in this space.

Traverse City, Michigan-based Naveego, which makes cloud-based Data Quality Solutions [DQS] that detect and eliminate data quality issues across systems, has added a Big Data focus to the new 2.0 version of its Complete Data Accuracy Platform, which it is announcing today. Facilitated by a strategic partnership with Hortonworks, the new release enables the Naveego platform to handle Big Data ingestion and storage, leveraging technologies from the Hadoop ecosystem like Apache Kafka and Apache Spark. The platform’s Data Flow Log has also been enhanced, with a patent-pending capability to track, log and cleanse all data over distributed systems.

Naveego began as a BI platform inside Michigan MSP Safety Net, and was spun out in December 2014, with a focus on providing Master Data Management on a Data Quality Platform. It follows what the company calls a 1-10-100 Rule of Data Quality.

“The 1-10-100 rule is that while it takes $1 to verify a record as it is entered, it takes $10 to cleanse it later, and it ultimately costs $100 if a mistake is made and the problem persists,” said Derek Smith, Naveego’s CTO. “Data Quality is about validating the data to get the cost as close to that $1 point as possible.”
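The arithmetic behind the rule can be illustrated with a minimal sketch. The per-record dollar figures come from Smith's description; the batch size, the split between stages, and the names `COST_PER_RECORD` and `total_cost` are hypothetical and are not part of Naveego's product.

```python
# Unit costs per record, taken from the 1-10-100 rule as quoted (in dollars).
COST_PER_RECORD = {
    "verify_at_entry": 1,    # caught and verified when the record is entered
    "cleanse_later": 10,     # cleansed after the fact
    "uncorrected": 100,      # the mistake persists and causes a downstream problem
}


def total_cost(record_counts):
    """Return the total cost for a mapping of stage name -> number of records."""
    return sum(COST_PER_RECORD[stage] * count for stage, count in record_counts.items())


# Hypothetical batch of 1,000 problem records, handled two different ways.
all_at_entry = total_cost({"verify_at_entry": 1000})
mixed = total_cost({"verify_at_entry": 700, "cleanse_later": 250, "uncorrected": 50})

print(all_at_entry)  # 1000
print(mixed)         # 700*1 + 250*10 + 50*100 = 8200
```

In this made-up scenario, letting 30 per cent of the records slip past the point of entry raises the total cost roughly eightfold, which is the effect the rule is meant to convey.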

Smith, who built Naveego from its origins within Safety Net and was its original CEO, moved to the CTO role earlier this year, with Katie Horvath coming in as CEO.

“This was a planned succession,” he said. “We are getting ready for the next round of growth. I was the architect and my specialization is on the tech side. The plan was to get all the players in the right place for growth, and the best place for me is in the CTO role. It allows me to focus on the part of the role that I love.” Another part of the plan was bringing in a new VP of sales and marketing, Don Milner, who has an extensive background selling to enterprise software customers.

Naveego began selling entirely through a partner channel, and while that is still basically the model, the go-to-market has become more complex as the company has expanded.

“We have placed a new focus on strategic partnerships, particularly in the Big Data space, and on attracting more very large customers from the Fortune 1000,” Smith said. “We have done a little direct selling to get key flagship customers, and have developed relationships with OEMs and systems integrators. Our channel has added business intelligence and analytics partners, with whom we pair on data quality around dashboards. This all helps the channel because it brings in more big names, which enhances the reputation of the company.”

Naveego has also shifted to place much more emphasis on Big Data than it did coming out of the gate.

“What we have identified particularly is companies struggling with New Data that goes beyond the traditional data in ERP systems,” Smith said. “It includes streaming data, IoT data, interactive and behavioural data, and social media. Customers struggle with digital transformation because it’s easy to collect data and dump it into a data lake, but making sense of what they have, and making sure it is accurate for consumption is an issue. Data scientists rely on their data being accurate, and spend 80 per cent of their time cleansing data. So we are going into the Big Data space to address these issues, and we have upgraded the platform to work very well with these Big Data sets. We see a huge opportunity to take Data Accuracy into the AI machine learning space.”

Smith described the changes necessary to do this as involving a major overhaul of the platform.

“We entered a strategic partnership with Hortonworks to help bring us into this space,” he said. “We completely revamped the backend with Big Data technology and upgraded systems to deal with very high volumes of data. This was a complete restructuring to the core back-end system. Leveraging the Hadoop Big Data technologies lets us provide complete data accuracy in these environments. Since we still have secure distributed computing from client networks, we are getting into more of these secure verticals like health care and financials where Big Data capability is essential. We don’t require the customer to make any network changes which would open up security holes, and we use encryption and Web services to integrate with the cloud in a highly secure fashion.”

A key part of the revamp was enhancing the Data Flow Log, to track all data over distributed systems with multiple processors and provide full end-to-end visibility to users.

“We have a number of patents in how we do the distributed master data management and the way that we collect the metadata,” Smith said. “It lets the customer understand what they have in their data lakes, and what sets of data in a data lake are related to each other. We scope all that as data flow.”

Naveego now supports all the major cloud vendors, as well as VMware, including on-prem deployments with VMware.

“What’s new here is that we are now fully using Docker containers and Kubernetes,” Smith indicated.