With three months of remote work under their belts, data scientists are feeling the pinch of being separated from their teams. As many have mentioned at The Data Standard’s monthly member events, data scientists are eager to increase collaboration, working with data engineers, analysts and business users to develop the best models and data applications. An IBM and MIT study found that data scientists “collaborate extensively, but they also perform a variety of roles and work with a range of stakeholders during different stages of a data science project workflow.”
Lockdowns have created new challenges in working together, but none that these digital natives can’t overcome with some very effective tools. Dailymotion, a Paris-based video-sharing platform, is one company that has developed a strategy for efficient collaboration, running simultaneous work tracks that ultimately come together.
Collaboration and remote work tools are showing such success across the country and across industries that many of them are seeing outsized investments so that they can grow further and provide even more value: Figma and Notion each raised $50 million, and Microsoft’s Teams has surpassed 75 million daily users.
Here are three proven approaches to keep teams connected and productive.
Adopting collaborative platforms. Zoom and Slack have become as de rigueur to remote work life as breathing air. Moving to the next level requires a collaborative platform. Tyler Folkman of Branded Entertainment Network uses Comet to enable remote teams to track, compare, and visualize models: It “provides a meta-machine learning platform, runnable in the cloud or on-premise/VPC, that allows data science teams to do just this: reproduce full experiments (and not just code), manage users across large distributed data science teams, and provide managers insight into team contributions and performance,” he says.
Synchronizing remote teamwork. Charting a task map to reach clearly set goals is the realm of project management software. Data science team leaders can choose from a variety of programs that track both individual and shared progress. Dan Malowany, head of deep learning research at Allegro, turns to Jira to tackle remote project management.
“In Jira’s next-gen product, you can choose either a Kanban or Scrum template. As I focus on research, I found Scrum sprints to be less of a natural fit,” he says. “I prefer to work with the Kanban template that has a roadmap tab built in. I split the team goals into epics, and appoint each researcher to an epic. The researcher, in turn, splits the epic into tasks and subtasks. On the project board, I also add a ‘priority’ column to which I push tasks when urgent issues arise.”
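For readers who think in code, the board structure Malowany describes can be roughly modeled as a small data structure. This is only an illustrative sketch in plain Python, not Jira's API: the class names, column names, epic goal, and task titles are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    title: str
    subtasks: List[str] = field(default_factory=list)
    column: str = "To do"  # hypothetical default column


@dataclass
class Epic:
    goal: str   # one team goal
    owner: str  # the researcher appointed to this epic
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Board:
    columns: List[str]
    epics: List[Epic] = field(default_factory=list)

    def move(self, task: Task, column: str) -> None:
        """Move a task to another column, e.g. 'Priority' for urgent work."""
        if column not in self.columns:
            raise ValueError(f"unknown column: {column}")
        task.column = column

    def in_column(self, column: str) -> List[str]:
        """Return the titles of all tasks currently sitting in a column."""
        return [
            task.title
            for epic in self.epics
            for task in epic.tasks
            if task.column == column
        ]


# Hypothetical example: one epic, one task, pushed to the priority column.
board = Board(columns=["To do", "Priority", "In progress", "Done"])
epic = Epic(goal="Improve model accuracy", owner="researcher_a")
task = Task("Run baseline", subtasks=["Prepare data", "Train model"])
epic.tasks.append(task)
board.epics.append(epic)

board.move(task, "Priority")
print(board.in_column("Priority"))
```

The point of the sketch is the hierarchy (goals become epics, epics own tasks, tasks carry subtasks) and the single extra column that urgent work gets pushed into.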
Notion is also grabbing the attention of users across industries, including data scientists.
Making progress visible. Having shared goals is a given for high-functioning teams. But visibility into movement toward the finish line is hard when workers are physically disconnected. Pascal Bugnion, data engineer at Faculty, says it’s important to make that progress visible to everyone.
“Having a good source code manager like GitHub, GitLab or Bitbucket helps with this,” he says. “The steady flow of PRs, comments, and reviews gives a sense of motion to the team. Continuous integration pipelines running from this source code manager also increase visibility.”
Companies also turn to Airflow to keep track of data processing or model retraining pipelines. Systems of record like MLflow are excellent tools during experimentation to keep the entire team in the loop.
The best tool of all. The secret weapon deployed by the most successful data science teams? Chitchat. Don’t make every Zoom call and Slack channel all business. Be sure to leave unstructured team time to discuss weekend plans, the latest Netflix series, and anything else to help foster camaraderie and keep the team connected at such a disconnected time. There’s nothing like finding common ground over “Tiger King.”