The Data Standard
In this episode of The Data Standard, Catherine Tao and Vinoo Ganesh talk about large-scale data and data processing challenges. Vinoo starts the conversation by explaining his current responsibilities and how his company uses data to find working solutions for a wide range of problems.
Then he talks about OLTP (online transaction processing) and OLAP (online analytical processing) models and how large-scale data can help improve workflows and offer better results. Every application needs its own optimization, and Vinoo talks about the methods he uses to enhance existing platforms. Even when newly developed systems show positive results, the work is never done, as optimization is a constant, dynamic process.
He then goes over the techniques used to extract useful data. The distribution of data and data types have the most significant impact on data quality. Vinoo talks about the challenges of working with data, where a simple data movement can present a massive problem. Constant profiling is needed to help scale the data and make sure that the computing power can cope.
Finally, the guest talks about handling messy data that doesn’t have the required quality. He covers the problems data scientists have to consider when sorting through messy data to make it more useful.
Meet The Host
Data scientist at The Data Standard
Catherine Tao is a tech enthusiast looking for new methods for building connections with businesses around the world. Her extensive knowledge of data science has allowed her to develop new solutions and implement them into existing ecosystems. She is currently working as a Data Scientist and Exclusive Podcast Producer at The Data Standard.
Meet The Guest
Chief Technology Officer at Veraset
Vinoo Ganesh is currently working as the CTO at Veraset. He has extensive knowledge of retrieval technology and distributed storage. Vinoo has also spent a lot of time working with data analysis technologies at multiple companies, including Spark, Hadoop, Cassandra, and Elasticsearch.