Business Analyst Training and Product Management Courses

How can Business Analysts contribute to feeding ML (Machine Learning)?

Your AI is hungry. It needs data, whether structured or unstructured, to build its vast library of knowledge through Machine Learning (ML). As business analysts, we have long struggled to stop being note takers, but with the advent of AI we are fast becoming note pack rats.

All those meeting notes, action items, emails, hallway discussions, brainstorming sessions, requirements workshops, and just about any other piece of documentation are valuable to ML. Why? AI is about finding patterns, correlations, relationships, and connections in the data. More relevant data generally means better decisions.

In this article, we'll delve into the pivotal role of business analysts in leveraging both structured and unstructured data to populate ML, shedding light on the nuances and opportunities inherent in each approach.

Structured Data: The Backbone of ML

Structured data refers to information that is organized in a predefined format, typically stored in databases or spreadsheets. Examples include numerical data, categorical variables, and timestamps. Structured data lends itself well to traditional analytical methods and is the backbone of many ML applications.

As business analysts, our first contribution to feeding structured data into ML lies in data collection and curation. We gather data from various sources (internal databases, CRM systems, ERP platforms, and external APIs) and check its quality, accuracy, and relevance to the problem at hand.

Next, we engage in data preprocessing, a critical step in preparing structured data for ML. This involves tasks such as cleaning the data to remove duplicates and outliers, handling missing values, and standardizing formats. By leveraging our domain knowledge and analytical skills, we ensure that the data is in optimal condition for training ML models.
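The cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the field names (`email`, `region`, `income`), the sample records, and the choice of median imputation are all assumptions made for the example, and outlier handling is left out.

```python
from statistics import median

def preprocess(records):
    """Deduplicate, standardize formats, and fill missing values."""
    seen = set()
    cleaned = []
    for rec in records:
        # Standardize formats: trim and lowercase emails, title-case regions.
        email = rec["email"].strip().lower()
        region = rec["region"].strip().title()
        # Remove duplicates: keep only the first record per email address.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append({"email": email, "region": region, "income": rec.get("income")})
    # Handle missing values: fill missing income with the median of known values.
    known = [r["income"] for r in cleaned if r["income"] is not None]
    fill = median(known) if known else None
    for r in cleaned:
        if r["income"] is None:
            r["income"] = fill
    return cleaned

# Hypothetical raw export with a duplicate, mixed casing, and a missing value.
raw = [
    {"email": " Ann@Example.com ", "region": "north", "income": 52000},
    {"email": "ann@example.com", "region": "North", "income": 52000},
    {"email": "bob@example.com", "region": "SOUTH", "income": None},
    {"email": "cy@example.com", "region": "south", "income": 61000},
]
clean = preprocess(raw)
```

Whether a missing value should be imputed, flagged, or dropped is a business decision as much as a technical one, which is where the analyst's domain knowledge comes in.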

Once the data is preprocessed, we embark on feature engineering, another essential aspect of feeding structured data into ML. Feature engineering involves selecting, transforming, and creating new features from the raw data to enhance the predictive power of the ML model. As business analysts, we draw upon our understanding of the business domain to identify relevant features that capture the underlying patterns and relationships in the data.
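As a rough sketch of what feature engineering can look like, the example below turns a raw order history into a handful of model-ready features. The `orders` structure, the feature names, and the sample customer are hypothetical; real features would come from the business question being asked.

```python
from datetime import date

def engineer_features(customer, today):
    """Derive summary features from a customer's raw order history."""
    orders = customer["orders"]  # list of (order_date, amount) tuples
    total = sum(amount for _, amount in orders)
    last_order = max(order_date for order_date, _ in orders)
    weekend_orders = sum(1 for order_date, _ in orders if order_date.weekday() >= 5)
    return {
        "order_count": len(orders),
        "avg_order_value": total / len(orders),
        "days_since_last_order": (today - last_order).days,
        "weekend_share": weekend_orders / len(orders),
    }

# Hypothetical customer with one Saturday order and one Wednesday order.
customer = {"orders": [(date(2024, 1, 6), 40.0), (date(2024, 1, 10), 60.0)]}
features = engineer_features(customer, today=date(2024, 1, 20))
```

Each derived value encodes a business intuition (recency, spend level, shopping habits) that the raw rows only imply.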

Finally, we collaborate closely with data scientists and ML engineers to select the appropriate ML algorithms and techniques for modeling the structured data. We provide valuable insights into the business context, objectives, and constraints, guiding the selection and evaluation of ML models to ensure alignment with organizational goals.

Unstructured Data: Tapping into a Wealth of Insights

In contrast to structured data, unstructured data refers to information that does not have a predefined format and is often text-heavy or multimedia-based. Examples include text documents, social media posts, images, and videos. Unstructured data poses unique challenges and opportunities for ML, requiring innovative approaches to extract meaningful insights.

As business analysts, our role in feeding unstructured data into ML begins with data acquisition. We scour a myriad of sources—emails, customer feedback, social media platforms, news articles, and more—to gather unstructured data relevant to the business problem at hand. We leverage advanced web scraping techniques, APIs, and data mining tools to collect and aggregate diverse sources of unstructured data.

Next, we engage in text mining and Natural Language Processing (NLP) to extract insights from unstructured textual data. NLP algorithms analyze the text, identify key entities, sentiments, and topics, and transform the unstructured data into structured formats that can be fed into ML models. By applying techniques such as sentiment analysis, topic modeling, and named entity recognition, we uncover valuable insights hidden within vast troves of unstructured text.
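To make the idea of turning unstructured text into structured rows concrete, here is a deliberately simple sketch. It uses a tiny hand-made word list rather than a trained NLP model, so the `POSITIVE` and `NEGATIVE` sets, the scoring rule, and the sample feedback are all illustrative assumptions.

```python
import re
from collections import Counter

# Toy sentiment lexicon, invented for this example only.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "refund", "confusing"}

def text_to_row(text):
    """Convert free text into a structured record with a sentiment label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    positive_hits = sum(counts[word] for word in POSITIVE)
    negative_hits = sum(counts[word] for word in NEGATIVE)
    score = positive_hits - negative_hits
    if score > 0:
        sentiment = "positive"
    elif score < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return {
        "token_count": len(tokens),
        "positive_hits": positive_hits,
        "negative_hits": negative_hits,
        "sentiment": sentiment,
    }

feedback = [
    "Great support, fast and helpful!",
    "The checkout page is slow and confusing.",
]
rows = [text_to_row(text) for text in feedback]
```

In practice a data science team would likely swap the word lists for a proper NLP library or model, but the output shape, one structured row per document, stays the same.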

In addition to text data, we also harness the power of image and video analysis to extract insights from multimedia sources. Computer vision algorithms can analyze images and videos to identify objects, patterns, and anomalies, providing valuable context and insights for ML applications. Whether it's analyzing product reviews, monitoring social media sentiment, or detecting brand logos in images, business analysts play a pivotal role in leveraging unstructured data to enhance ML capabilities.

Contributing to the Process

Here's how business analysts contribute to the process:

Data Collection: Business analysts are often tasked with identifying relevant sources of data that can be used to train ML models. This may involve gathering data from internal databases, third-party sources, or external APIs.

Data Cleaning and Preprocessing: Raw data is often messy and inconsistent. Business analysts are responsible for cleaning and preprocessing the data to ensure its quality and consistency. This may involve tasks such as removing duplicates, handling missing values, and standardizing formats.

Feature Engineering: Feature engineering involves selecting, transforming, and creating new features from the raw data to make it more suitable for ML algorithms. Business analysts leverage their domain knowledge to identify relevant features that can improve the performance of the ML model.

Here are some examples of data features:

Numerical Features - Age, Income, Temperature, Number of transactions, stock prices

Categorical Features - Gender, Marital Status, Product Category, geographic regions, customer segments

Binary Features - Yes/No, True/False, Presence/Absence

Textual Features - product descriptions, customer reviews, emails, news articles

Temporal Features - Date of purchase, Time of day, day of week, seasonality, daylight saving time

Geospatial Features - Latitude, Longitude, Postal Codes, Landmarks, Mountains, Waterfalls

Image Features - Pixels, color histograms, shape descriptors

Audio Features - Frequency, pitch, tempo, rhythm, Mel-frequency cepstral coefficients (MFCCs), sound waves

Derived Features - Ratios, proportions, differences or changes over time, aggregated statistics (mean, median, standard deviation)

Metadata Features - Data source, origin of data, timestamps, data quality or reliability indicators.

These are just a few examples of features found in raw data. In practice, the choice of features depends on the specific problem being addressed and the characteristics of the dataset. Effective feature selection and engineering are essential for building accurate and robust machine learning models.
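Most ML algorithms expect numbers, so feature types like those listed above usually need encoding before training. The sketch below shows one common approach, one-hot encoding for a categorical feature alongside numerical and binary features. The `SEGMENTS` list and the row fields are assumptions chosen for illustration.

```python
# Hypothetical set of customer segments (a categorical feature).
SEGMENTS = ["consumer", "small_business", "enterprise"]

def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of the matching category."""
    return [1.0 if value == category else 0.0 for category in categories]

def encode_row(row, segments):
    """Combine numerical, binary, and categorical features into one numeric vector."""
    numerical = [float(row["age"])]
    binary = [1.0 if row["subscribed"] else 0.0]
    categorical = one_hot(row["segment"], segments)
    return numerical + binary + categorical

row = {"age": 34, "subscribed": True, "segment": "small_business"}
vector = encode_row(row, SEGMENTS)  # [34.0, 1.0, 0.0, 1.0, 0.0]
```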

Data Labeling: In supervised learning tasks, where the ML model is trained on labeled data, business analysts may be involved in labeling the data. This entails assigning labels or categories to the data based on predefined criteria.
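One way to picture labeling against predefined criteria is a small set of keyword rules agreed with the business. The rules, label names, and tickets below are hypothetical; anything the rules cannot decide is routed to a person rather than guessed.

```python
# Hypothetical labeling criteria: (label, keywords that trigger it).
LABEL_RULES = [
    ("billing", ("invoice", "charge", "refund")),
    ("access", ("password", "login", "locked")),
]

def label_ticket(text):
    """Assign the first matching label, or flag the ticket for manual review."""
    lowered = text.lower()
    for label, keywords in LABEL_RULES:
        if any(keyword in lowered for keyword in keywords):
            return label
    return "needs_review"

tickets = [
    "I was charged twice on my invoice",
    "Cannot login after password reset",
    "The new dashboard looks different",
]
labels = [label_ticket(ticket) for ticket in tickets]
```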

Data Exploration and Visualization: Business analysts use exploratory data analysis techniques to gain insights into the data and identify patterns or relationships that may be useful for training the ML model. Data visualization tools are often employed to visually represent the data and facilitate understanding.

Data Selection and Sampling: Business analysts may need to select a subset of the data or perform sampling techniques to ensure that the ML model is trained on representative data that generalizes well to unseen examples.
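A simple way to keep a sample representative is stratified sampling, where each group contributes in proportion to its size. The sketch below assumes a hypothetical `segment` field and a 10% sampling fraction; the fixed seed only makes the example repeatable.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from every group defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        # Take at least one row per group so small groups are not lost.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical dataset: 80 retail customers and 20 enterprise customers.
rows = [{"id": i, "segment": "retail" if i < 80 else "enterprise"} for i in range(100)]
sample = stratified_sample(rows, key="segment", fraction=0.1, seed=42)
segment_counts = Counter(row["segment"] for row in sample)
```

Here the sample keeps the original 80/20 split, with 8 retail and 2 enterprise rows, instead of leaving the mix to chance.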

Model Evaluation and Validation: While the technical evaluation of ML models is typically done by data scientists or machine learning engineers, business analysts play a role in interpreting and validating the model results from a business perspective. They assess whether the model meets the business objectives and requirements.
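As an illustration of checking model results against a business requirement, the sketch below computes precision and recall from predictions and compares recall with a target. The sample labels and the `MIN_RECALL` threshold of 0.8 are invented for the example; the real threshold would come from stakeholders.

```python
def precision_recall(actual, predicted):
    """Compute precision and recall for binary labels (1 = positive case)."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical churn outcomes (1 = churned) and the model's predictions.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]

# Assumed business requirement: catch at least 80% of churning customers.
MIN_RECALL = 0.8

precision, recall = precision_recall(actual, predicted)
meets_requirement = recall >= MIN_RECALL
```

With these sample values both precision and recall come out at 0.75, so the model would fall short of the assumed recall target even though its technical metrics might look reasonable in isolation.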

Overall, business analysts collaborate closely with data scientists, machine learning engineers, and other stakeholders to ensure that the data used to train ML models is relevant, clean, and representative of the problem domain. Their domain expertise and understanding of business requirements are invaluable in driving the success of ML initiatives within organizations.

Empowering ML with Business Analyst Expertise

Business Analysts are uniquely positioned to contribute to feeding data into ML models, leveraging both structured and unstructured data to unlock actionable insights and drive business value. Whether it's curating structured data, engineering features, or mining unstructured text, our expertise in data analysis, domain knowledge, and strategic thinking is indispensable in harnessing the full potential of ML.

By embracing the synergies between business analysis and ML, organizations can unlock new opportunities for innovation, efficiency, and competitive advantage. As business analysts, let us seize the moment and empower ML with our unique blend of expertise, creativity, and analytical prowess, driving organizational success in the data-driven era.

Tags #businessanalysis #ai