My latest activity as a data scientist after two years in my current company
What I have done for the past two years
Hello all, I am sorry for not updating my newsletter much. I will try to publish weekly, as I want to focus more on creating value for everyone.
I was on hiatus for a while because I was overwhelmed with my current work, especially as a professional data scientist in my current company. The whole team and I are attempting an amazing transformation, and I needed time to focus on that work.
However, the holidays have come, so I can shift gears to write more and share my experience and knowledge with everyone! I am excited that I can write actively once more.
So, today’s topic is what I have done as a data scientist over the past two years. I want to share this experience so that everyone can see what a data scientist does in corporate work, especially in my company. Let’s get into it.
1. Generate Data Science Framework
Every company has its own way of producing value with a data science project. The framework differs from company to company because each one has a unique way of utilizing its data (from data mining and storage to analysis).
Usually, data scientists do not work at a high level to set up the overall framework for everybody, but my scope of responsibility includes setting up the overall data science framework. So, for the past few months, I have been busy thinking about how to set up end-to-end analytical activity within my company.
What do I try to set up in the data science framework, though? There are many variables, including technical and non-technical aspects such as:
Data quality check
Data pipeline
Business KPI tracker
Risk assessment
Customer survey
Platform criteria
And many more. The items above are examples of things we need to consider when setting up the data science framework.
I did, though, split the parts among my teammates so that each of us has one to focus on and I would not be overwhelmed by all the requirements. The bigger picture is to have a standard framework that is useful for creating value, which means it should not be a one-person job.
Next time, I will probably share more detail about the framework, but for now, as it is still in the making, I cannot share more (especially the chart). However, please comment if you want to know more about the data science framework.
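To make the first item, the data quality check, a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the actual framework described in this post: the record fields (policy_id, customer_id, premium), the function names, and the 10% missing-value threshold are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical policy records; every field name and value is illustrative only.
records = [
    {"policy_id": "P001", "customer_id": "C01", "premium": 120.0},
    {"policy_id": "P002", "customer_id": "C02", "premium": None},
    {"policy_id": "P002", "customer_id": "C02", "premium": 95.5},
    {"policy_id": "P003", "customer_id": None, "premium": 80.0},
]


def missing_rate(rows, columns):
    """Return the share of rows in which each column is missing (None)."""
    if not rows:
        return {col: 0.0 for col in columns}
    total = len(rows)
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }


def duplicate_keys(rows, key):
    """Return the key values that appear in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def quality_report(rows, columns, key, max_missing=0.1):
    """Combine both checks; max_missing is an assumed tolerance, not a standard."""
    rates = missing_rate(rows, columns)
    dupes = duplicate_keys(rows, key)
    failed = sorted(col for col, rate in rates.items() if rate > max_missing)
    return {
        "missing_rate": rates,
        "duplicate_keys": dupes,
        "failed_columns": failed,
        "passed": not failed and not dupes,
    }


report = quality_report(records, ["policy_id", "customer_id", "premium"], "policy_id")
print(report)
```

A report like this could run before any modeling step, so that a table with too many gaps or duplicated keys is flagged early instead of surfacing as a strange model result later.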
2. Develop Many Machine Learning Models
I have been lucky enough to be trusted to lead various data science projects within my company in the past year. Many of these projects directly affect the company's revenue and decisions, which motivates me even more. Some of the projects are:
Propensity-to-Rebuy Model
Persistency Model
Customer Complaint Sentiment Model
The Propensity-to-Rebuy model has a simple premise: the company wants to know which existing customers would buy another insurance policy if offered one.
The Persistency model is different; it concerns which customers would stop paying their premium. Rather than producing new revenue, this model tries to retain existing customers.
The Customer Complaint Sentiment model seems simple enough: predict the sentiment of a customer complaint. But isn’t all complaint sentiment negative? Precisely! My company wants a finer classification of complaints because there are many kinds of complaints, and the sentiment of each complaint might help.
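As a rough illustration of what "more classification from the complaint" could look like, below is a minimal keyword-based baseline in Python. This is not the model described above; it is a simple rule-based sketch, and the category names (claim, billing, service) and their keyword lists are hypothetical, chosen only for the example.

```python
import re

# Hypothetical complaint categories and keywords, invented for illustration.
CATEGORY_KEYWORDS = {
    "claim": {"claim", "claims", "reimbursement", "rejected"},
    "billing": {"premium", "charged", "payment", "refund"},
    "service": {"agent", "call", "response", "waiting"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def categorize(complaint):
    """Pick the category with the most keyword hits, or 'other' if none match.

    Ties go to the category listed first in CATEGORY_KEYWORDS.
    """
    tokens = tokenize(complaint)
    scores = {cat: len(tokens & words) for cat, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


for text in [
    "My claim was rejected without explanation",
    "I was charged twice for my premium",
    "The website font is ugly",
]:
    print(categorize(text), "<-", text)
```

A baseline like this is mostly useful as a yardstick: a trained classifier should at least beat the keyword rules before it is worth the extra maintenance.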
What did I learn from leading data science projects this past year? Developing a machine learning model is the easy part; I could spend 1-2 days on it. However, there are many more problems before and after model development. For example:
Data quality. No single table contains the perfect data for your analysis and modeling. You need to do ETL from various data sources, and sometimes the quality differs greatly from one source to another. Cleaning it up takes quite a long time.
Data source. Speaking of data sources, sometimes access to a data source is unavailable or limited. The bureaucracy is complicated and time-consuming.
Faulty execution. You might have the perfect model, but it is useless if the business does not follow up on it. The execution of the data science project needs to be discussed beforehand and constantly evaluated with the business.
Incomplete tracking. Tracking the usability of your model is another project in itself. It is hard work because every project and business has different KPIs and ways to track them. Often, a project starts without proper tracking, which causes disruption later.
People leaving. Team members come and go, but when someone working with you on the same project leaves, it affects you drastically. I have experienced this lately.
I learned many more things from leading data science projects. It is a rewarding experience but certainly not an easy one. I advise every data enthusiast or data scientist to learn in depth about the business you are currently working in (or are interested in).
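On the tracking point above, one simple way to make a model's business impact measurable is to compare the customers contacted because of the model against a control group. The Python sketch below assumes a conversion-rate KPI and made-up outcome data (1 = customer bought, 0 = did not); the function names and the KPI choice are illustrative assumptions, not how tracking is done in my company.

```python
def conversion_rate(outcomes):
    """Share of customers who converted; outcomes is a list of 0/1 flags."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def uplift(model_group, control_group):
    """Difference in conversion rate between the model-driven and control groups."""
    return conversion_rate(model_group) - conversion_rate(control_group)


# Made-up outcomes for illustration only.
model_group = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]    # customers selected by the model
control_group = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]  # customers selected at random

print("model group rate:", round(conversion_rate(model_group), 2))
print("control group rate:", round(conversion_rate(control_group), 2))
print("uplift:", round(uplift(model_group, control_group), 2))
```

The point is less the arithmetic than the setup: the control group has to be agreed with the business before the campaign starts, otherwise there is nothing to compare against afterwards.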
3. Moving to a cloud platform
2021 has been a fun year for me in terms of learning new things. One of them is the cloud platform. While some of my work does not necessarily need a cloud platform, my goal of automating everything requires me to go there.
This past year I learned about cloud platforms such as AWS but haven’t had a chance to apply them in my own company. Luckily, I am one of the people who push hard for cloud platforms because of the benefits and value we could get from using the cloud.
Because of my persistence, I am now also leading the implementation of the cloud platform and have a chance to set it up the way I prefer. Moreover, I get access to many learning resources and connections with people I hadn’t connected with previously.
If you want something and can find value in your idea, keep pushing it. The effort you put in and the process you go through will pay off, especially if you have planned from the beginning what you want to learn from it.
I cannot share much more about this activity because it is still in progress, but I will share more updates and learnings in the future.
4. Active in the company's data activities
I am lucky that my company is a multinational where the existing data community is big. The data community periodically holds data events such as seminars, workshops, learning academies, and hackathons.
With so many internal activities in the data community, I can learn from various people in other parts of the world, especially people who work in the same industry and on similar projects.
What I love the most is the hackathon, where we compete to solve a business problem with a data science project. I am truly learning a lot from this activity and gaining knowledge I never had before.
There are some more activities I did as a data scientist in these past two years, but I think the four points above show how much I have done and how much I have changed.
I never knew I could lead a data science project and create a data science framework that the company would use. I think next year will bring new experiences for me, so I am looking forward to it.
If anybody wants to discuss becoming a data scientist in the industry, or just wants to have a general discussion, reach out to me on social media or through my newsletter.