The Team at CallMiner
November 18, 2019
Data mining is one of the most insightful and potentially most powerful tools businesses can harness in the modern economy. The ability to recognize patterns comes with a myriad of benefits.
Choosing tools to gather massive amounts of data can be tricky. To assist you in finding the right solution, we’ve compiled a list of tips, quotes and other insights from experts around the web. These snippets are in no particular order and encompass a range of both buying tips and insights for implementing and using the software of your choice.
“Data mining makes it possible for businesses to get intelligible insights out of their data, whether it is open source data or not. However, the data mining process is an extensive one, which requires the combination of steps. The data mining process differs from use case to use case and company to company.” – Nida Fatima, A Quick Guide to Data Mining, Astera; Twitter: @AsteraSoftware
“Data mining relies on the actual data present, hence if data is incomplete, the results would be completely off-mark. Hence, it is imperative to have the intelligence to sniff out incomplete data if possible. Techniques such as Self-Organizing-Maps (SOM’s), help to map missing data by visualizing the model of multi-dimensional complex data. Multi-task learning for missing inputs, in which one existing and valid data set along with its procedures is compared with another compatible but incomplete data set is one way to seek out such data. Multi-dimensional perceptrons using intelligent algorithms to build imputation techniques can address incomplete attributes of data.” – Gopinadh Gullipalli, 12 Data Mining Tools and Techniques, Invensis Technologies; Twitter: @Invensis
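To make the point about incomplete data concrete, here is a minimal sketch that counts missing values per field and fills numeric gaps with the field mean. It is far simpler than the SOM or multi-task approaches named in the quote and is not a substitute for them. The record layout and the field names `age` and `spend` are hypothetical.

```python
from statistics import mean

# Hypothetical customer records; None marks a missing value.
records = [
    {"age": 34, "spend": 120.0},
    {"age": None, "spend": 80.0},
    {"age": 46, "spend": None},
    {"age": 40, "spend": 100.0},
]

def missing_counts(rows):
    """Count missing (None) values per field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (value is None)
    return counts

def impute_mean(rows):
    """Return a copy of rows with None replaced by that field's mean."""
    fields = {f for row in rows for f in row}
    means = {
        f: mean(r[f] for r in rows if r.get(f) is not None)
        for f in fields
    }
    return [
        {f: (means[f] if r.get(f) is None else r[f]) for f in fields}
        for r in rows
    ]

print(missing_counts(records))
filled = impute_mean(records)
```

Mean imputation is only a baseline: it keeps the column average intact but shrinks its variance, so it is worth checking how many values were filled before trusting any downstream result.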
“Retail and marketing organizations are collecting massive amounts of data on customers, but they’re not always getting all the use out of this information that they could. As new privacy regulations promise to constrain the use and sharing of private data, it’s becoming increasingly important to use the data wisely.
“In fact, according to a recent Adobe survey of nearly 13,000 marketing and advertising professionals, the biggest opportunities B2C companies saw for 2019 was the use of personalized, data-driven marketing. In addition, 55% of organizations said the biggest shift they’re seeing is the better use of data for more effective audience segmentation and targeting. In second place, at 42% in the Adobe survey, was improving customer intelligence and insights for a holistic customer view.” – Maria Korolov, 6 tips for effective customer data mining, TechTarget; Twitter: @BizAnalyticsTT
“Data is important but data without context is meaningless. Social listening competency matters as people are pouring out their hearts to you.
“Social listening in its purest form doesn’t assume anything – it’s an opportunity to answer questions that you don’t even know you should ask. It helps you decipher what signals to take in, which are being amplified in the popular culture, and which need responses from your brand.
“Respond to fans and detractors. ‘Hug your haters,’ as Jay Baer says, ‘Haters are not your problem; ignoring them is.’
“Instead of starting yet another dashboard, start sending a company-wide email about what people are sharing about your company – positive and negative. This is how you add context to your data.” – Irina Jordan, Try this Data Mining Tip for Increased Social Engagement, Fanatics Media; Twitter: @FanaticsMark
“Data mining techniques are used to extract useful knowledge from raw data. The extracted knowledge is valuable and significantly affects the decision maker. Educational data mining (EDM) is a method for extracting useful information that could potentially affect an organization. The increase of technology use in educational systems has led to the storage of large amounts of student data, which makes it important to use EDM to improve teaching and learning processes. EDM is useful in many different areas including identifying at-risk students, identifying priority learning needs for different groups of students, increasing graduation rates, effectively assessing institutional performance, maximizing campus resources, and optimizing subject curriculum renewal. This paper surveys the relevant studies in the EDM field and includes the data and methodologies used in those studies.” – Abdulmohsen Algarni, Data Mining in Education, International Journal of Advanced Computer Science and Applications
“Data miners are the backbone of any blockchain as they solve cryptographic problems to verify transactions. The purpose of blockchains is to provide internal incentive structures so that they organize themselves decentralized. In order to do this, the miners are paid small fees after verifying the transactions. By verifying the transactions by the miners, the data blocks can be attached to the respective block chain. The respective cryptocurrency is then distributed to the miners as a reward for performing this computationally intensive task. The mining efficiency can be controlled by optimizing the respective ASICs server. Changes in facility hash rate, power capacity, and temperature directly affect the success rate. With an experienced management, mining can be targeted to supply and demand. Also, the higher the price, the more miners are active on the blockchain, which in turn results in a higher network hash rate.” – Florian Grummes, Atlas Cloud: A New And Promising Blockchain And Data-Mining Insider Tip, Midas Touch Consulting; Twitter: @FlorianGrummes
“Search data offers a comprehensive overview of the types of products your shoppers are looking for. In addition to indicating which products are in high demand, and which are lagging, it reveals searches for items that aren’t necessarily in your catalog—but may perform well if you had them. Use these search signals to adjust your inventory and ensure you are prepared for a rise in demand.
“It’s also effective at revealing gaps in the product discovery journey. For instance, if one term is attracting a high volume of searches but converting at a very low rate, that’s an indication that there may be issues with assortment, pricing, or your search results.
“By using your search data to identify these gaps, you’ll be able to discover opportunities for growth and improve the overall search experience for your users.” – Dana Naim, Six ways to boost conversions with search intent data, Digital Commerce 360; Twitter: @twiggle_com, @DC360_Official
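The gap described above, a term with many searches but few conversions, can be flagged with a few lines of code. The sketch below assumes a hypothetical summary of search logs (term mapped to searches and purchases). The thresholds `min_searches` and `max_rate` are illustrative values, not recommendations from the quoted article.

```python
# Hypothetical search log summary: term -> (searches, purchases).
search_stats = {
    "rain jacket": (1200, 18),
    "wool socks": (900, 81),
    "trail shoes": (150, 2),
    "camp stove": (1100, 99),
}

def low_converting_terms(stats, min_searches=500, max_rate=0.03):
    """Return high-volume terms whose conversion rate is below max_rate."""
    flagged = []
    for term, (searches, purchases) in stats.items():
        rate = purchases / searches if searches else 0.0
        if searches >= min_searches and rate < max_rate:
            flagged.append((term, round(rate, 4)))
    # Highest search volume first, since those gaps affect the most shoppers.
    return sorted(flagged, key=lambda item: -stats[item[0]][0])

gaps = low_converting_terms(search_stats)
print(gaps)
```

A flagged term does not say which of assortment, pricing, or result quality is at fault; it only narrows down where to look.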
“Identifying the best predictors from a wide range of independent variables is the first part of the modelling process. Throwing all the information in, testing multiple models and then refining the selection process down, all in the first day of your project, gives a leap forward in productivity. This is known as “throwaway modelling” and it’s a valuable part of the process because being able to chuck everything in, throw out the things that don’t work and keep the cream from the top means that the bias of the analyst or the slowness of programming a new routine does not interfere with the accuracy of the results. If you skip this part of the process, then there’s a risk that you’ll miss an important relationship in your data that you hadn’t thought of or which doesn’t fit with your own pet theory.” – Rachel Clinton, Nine tips for effective data mining, Smart Vision Europe; Twitter: @sveurope
“Used responsibly, big data can effectively facilitate and transform customer/company relationships. Using it to examine the world from the customer’s point of view, organizations get to ‘know’ their customers a lot better, so can minimize the gaps and disconnects in their marketing strategies, leading to better engagement through more personalized campaigns and communication.
“The better the big data connection, the better the results. Once brands are effectively using their data – to feed back customer satisfaction levels, willingness to recommend, and if, when and why customers intend to re-purchase – they can adapt and improve. Big data knowledge has a correlating effect on positively enhancing the customer experience: for example, fewer calls may be made to customer services, marketing can be more accurately targeted (also reducing overheads), and so customer satisfaction can be increased.” – Kiran PV, Using Big Data To Improve Customer Experience, Analytics Training; Twitter: @jigsawacademy
“For many, artificially intelligent integrations, such as chatbots, predictive personalization and virtual assistants automate customer relationship tasks to help businesses manage the wide spectrum of communication avenues. All three of these examples use artificial intelligence to respond to simple customer inquiries or relay these conversations to the necessary team member, which can help your business share insights with its consumers even after the office has closed.
“Perhaps most importantly, AI gives even small startups and organizations the ability to inform their customer communication strategies through data-driven insights. By data-mining through hundreds or thousands of independent customer journeys, you can better allocate your time and resources to match the demands of your customers, all without dedicating any more of your time to acquiring these insights.” – Leslie Yancy, Tips for digitally transforming your communication strategies, TechTalks; Twitter: @bdtechtalks
“So Facebook (and other Web giants) accumulate all our personal data points over time. The more data there is in one place, the more value it has for data mining. Over time, and in context of other individual data points, it becomes Big Data. Using data integration, it’s then mixed on the back-end with other data sources that, as end-users, we’ll never be aware.
“Increasingly, identifiable data collection is happening in more dimensions than are ever understood by most users. Some apps now offer ‘general’ surveys or take note about group preferences but are really harvesting detailed notes that track us individually.
“These apps, we know, use data analytics to analyze ‘friends of friends’ comments to compile data about us. They even determine our current emotional state from textual analysis or online behavior. It’s now possible to correlate how sad or depressed someone might be by analyzing the volume and variety of their online interactions.
“One big pitfall in data analysis is simply failing to look at your data. Real-world experiments often yield complex, high-dimensional results, however, and when your tabular dataset has 7 dimensions, simply looking at raw values is not as straightforward as it seems.
“Dimensionality reduction techniques are useful here – they allow you to take high-dimensional, complex data and transform them into lower-dimensional spaces (2D or 3D), making them more visually intuitive. Dimensionality reduction techniques like PCA, t-SNE or Autoencoders are common ways to begin exploring your data.
“Understanding how dense or sparse your data are, whether your data are normally distributed, and how your data covary are all questions to address during exploratory analysis in order to build better predictive models.” – Alex Harston, Data Mining Techniques: From Preprocessing to Prediction, Technology Networks; Twitter: @Tech_Networks
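One of the exploratory questions raised above, how the columns of a dataset covary, can be checked before reaching for PCA or t-SNE. The following is a small sketch that computes the Pearson correlation for every pair of columns using only the standard library. The column names and values are made up for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical measurements; each list is one column of a tabular dataset.
columns = {
    "temperature": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pressure": [2.1, 3.9, 6.2, 8.0, 9.8],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
}

# Correlation for every unordered pair of columns.
correlations = {
    (a, b): pearson(columns[a], columns[b])
    for a, b in combinations(columns, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Strongly correlated columns, such as the two related ones in this toy data, are the ones a dimensionality reduction step would effectively merge. Note that `pearson` divides by zero if a column is constant, so constant columns should be dropped first.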
“Today, companies in every industry are overwhelmed by an unprecedented amount of data.
“The influx of information is so voluminous and complex that traditional data analytic systems are ill-equipped to handle it. In fact, interpreting data has become such a major factor in day-to-day business operations that the field of Big Data has formed, and companies are hiring Chief Data Officers (CDO) — a relatively recent phenomenon.
“The tension underscoring data analytics and digital transformation is between fast business intelligence and slow business intelligence. Most businesses have no problem capturing the data, but now, they need to be able to react to that data in real time. And as data is coming in faster than most machinery’s capacity to handle it, I would suggest saying that enterprises find new ways to address the opportunity.” – John Dillon, Today’s Companies Are Overwhelmed With Data. Here’s How The Smart Ones Will Use It, Forbes; Twitter: @aerospikedb, @Forbes
“EasyJet is also adopting AI tools for predicting maintenance, using London-based startup Aerogility’s decision support tool set that features intelligent software agents capable of representing every aircraft in the low-cost carrier’s fleet. Every aircraft, including its individual software parts and upgrades, modifications and operating profiles are represented in Aerogility’s web-based application and SQL database capable of configuration and simulation output data, including analytics, schedules, and model configuration parameters.
“The tool is used by EasyJet to automate daily maintenance planning for its fleet, including the forecasting of heavy maintenance, while simultaneously factoring in existing plans with third-party suppliers and incorporating individual fleet modification and upgrade schedules. EasyJet first started using the new tool in December 2017 and has continuously upgraded its capabilities, which now include forecasting of engine shop visits and landing gear overhauls.” – Woodrow Bellamy III, Airlines are Increasingly Connecting Artificial Intelligence to Their MRO Strategies, Avionics International; Twitter: @AvionicsMag
“Data mining allows businesses to sift through all the chaotic and repetitive noise in their data and understand what is relevant, then make good use of that information to assess likely outcomes. The process identifies patterns and insights that can’t be found elsewhere, and by using automated processes to find the specific information, it not only speeds up the time it takes to find the data but also increases the reliability of the data.
“Once the data is gathered, it can be analyzed and modelled to convert it into actionable insights for the business to use.” – Clare Hopping, Bobby Hellard, What is data and big data mining? An easy guide, IT Pro; Twitter: @ITPro
“Of course, using this data without compromising users’ privacy is a challenge. When dealing with location information, anonymization can only take you so far. But there is a neat solution. In exchange for their data, passengers could receive a wealth of benefits, including more flexible routes and timetables, predictive of need at any given hour. The level of service could be directly linked to the amount of data a passenger chooses to share.
“By combining these data with efficient ticketing across a range of transport modes, including bus, tram, train, taxi and others, it would be possible to create a flexible and responsive system, which can tailor transport solutions to every person’s needs.” – Marcin Budka, Manuel Martin Salvador, Harvesting big data could bring about the next transport revolution, right now, The Conversation; Twitter: @ConversationUS
“Benefitting both financial firms and customers, predictive banking is a useful tool in combating fraud. Feedzai is an innovative platform that has been employed by CitiBank that constantly evaluates huge amounts of data to monitor accounts and potential threats and notify customers of suspicious activity.
“By aggregating a customer’s financial data, predictive banking can detect even the slightest irregularity in spending that might otherwise go undetected. All activity, online and offline and across all devices is gathered, analyzed, and shared, providing a complete picture of each client.” – Matthew Flynn, Predictive banking could transform banks and financial firms, to their benefit and the customer’s, BOSS Magazine; Twitter: @bossmagmedia
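As a rough illustration of what "detecting an irregularity in spending" can mean, the sketch below flags a transaction whose amount lies far from a customer's historical mean, measured in standard deviations (a z-score test). This is a generic textbook technique, not a description of how Feedzai or any bank actually works. The transaction amounts and the `threshold` value are assumptions.

```python
from statistics import mean, stdev

def is_irregular(history, new_amount, threshold=3.0):
    """Flag new_amount if it is more than `threshold` standard
    deviations away from the mean of past amounts."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: any different amount stands out.
        return new_amount != mu
    return abs(new_amount - mu) / sigma > threshold

# Hypothetical card purchases for one customer.
history = [42.0, 38.5, 51.0, 47.2, 40.3, 44.8]

print(is_irregular(history, 45.0))
print(is_irregular(history, 400.0))
```

Production systems combine many more signals (device, location, merchant, timing), but the same idea applies: build a picture of normal behavior and measure how far a new event departs from it.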
“The big change feeding into the predictive analytics boom is not just the advancement of ML and AI, but that it’s not just data scientists using these techniques anymore. BI and data visualization tools, along with open-source organizations like the Apache Software Foundation, are making Big Data analysis tools more accessible, more efficient, and easier to use than ever before. ML and data analysis tools are now self-service and in the hands of everyday business users — from our salesperson analyzing lead data or the executive trying to decipher market trends in the boardroom to the customer service rep researching common customer pain points and the social media marketing manager gauging follower demographics and social trends to reach the right targeted audience with a campaign. These use cases are just the tip of the iceberg in exploring all of the ways predictive analytics is changing business…” – Rob Marvin, Predictive Analytics, Big Data, and How to Make Them Work for You, PCMag; Twitter: @PCMag
“Data Operations (DataOps) is rapidly emerging as a discipline for organizations that continue to struggle with the management of data as a shared business asset. DataOps brings a set of data engineering principles which borrow from the DevOps software development movement. The intent is to deliver “rapid, comprehensive, and curated data” to business analysts and decision-makers. We expect 2019 to be a breakthrough year for DataOps approaches as firms strive to derive value quickly and efficiently from their data assets.” – Thomas H. Davenport, President’s Distinguished Professor of Information Technology and Management at Babson College, A 2019 Forecast for Data-Driven Business: From AI to Ethics, Forbes; Twitter: @tdav, @Forbes
“It has been observed that a poor or undefined business need is often the cause of a failed implementation project. Businesses should emphasize on clear objectives of their big data strategy, such as, identifying high value customers to offer specific products or services, process improvements for optimizing cost, and so on.
“Once a business has set its definite goals, the next task is to identify the key metrics or insights to fulfil it. CDOs can quest on the most appropriate insights which can be offered as a service to accomplish business objectives. In case of multiple objectives, they can prioritize them based on its ease of implementation and impact to the business. Using proof of concepts to evaluate hypothesis can reduce future surprises while scaling up. POC also helps in identifying gaps in technical solutions and in providing insights on their business usefulness.” – Jai Manral, A reference guide for implementing data mining strategy, Mindtree; Twitter: @Mindtree_Ltd
“This step is used to reduce the number of tools to a number using which we can easily manage. The tools that don’t satisfy the constraints set by the organization, e.g. If the organization has taken a decision about using mac tools, then we should eliminate the non-mac tools. It is a simple yet valuable phase that helps us in wasting less amount of time.” – Vikas Verma, Sunil Dhawan, Methodology for Selection of a Data Mining Tool, International Journal of Software & Hardware Research in Engineering; Twitter: @ijshre
“Red Roof Inn, a well-known hotel chain applied data mining methods to a diversified set of available content that included weather information, flight cancellation, airport locations, hotel locations, etc.
“This set of data allowed them to find clients that needed a room since their flights were canceled.
“Usually, bad weather is a bad sign for hotels, as it reduces traveling. However, Red Roof Inn found a way to improve their income. Reportedly, their business increased by 10% in 2014! Go data!” – Alex Seryj, Data Mining: Commodity or necessity of the 21st century?, QArea; Twitter: @QArea
“Data is the oil of the 21st century, and oil equals money. Data mining tools will help you generate more revenue by creating informational assets, used both by sales and marketing departments. They can study the behavior of your clients, their location, position and create solid marketing strategies.
“Enterprises thrive on the features of data mining tools, with them they can get detailed business intel, plan their business decisions and cut costs drastically. They can also help you detect anomalies inside your models and patterns to prevent your system from being exploited by third persons.
“With all those features on board, you won’t need to implement complex algorithms from the ground up. Moreover, you can adjust those features with some additional tweaking to the code base (if it’s an open source tool), as your demands grow.” – Yaroslav Panyok, Data Mining Tools: The What, The Why And The How, Snovio Labs; Twitter: @snov_io
“Once you’ve defined what you want to know and gathered your data, it’s time to prepare your data – this is where you can start to use data mining tools. Data mining software can assist in data preparation, modeling, evaluation, and deployment. Data preparation includes activities like joining or reducing data sets, handling missing data, etc.
“The modeling phase in data mining is when you use a mathematical algorithm to find pattern(s) that may be present in the data. This pattern is a model that can be applied to new data. Data mining algorithms, at a high level, fall into two categories – supervised learning algorithms and unsupervised learning algorithms. Supervised learning algorithms require a known output, sometimes called a label or target. Supervised learning algorithms include Naïve Bayes, Decision Tree, Neural Networks, SVMs, Logistic Regression, etc. Unsupervised learning algorithms do not require a predefined set of outputs but rather look for patterns or trends without any label or target. These algorithms include k-Means Clustering, Anomaly Detection, and Association Mining.
“Data evaluation is the phase that will tell you how good or bad your model is. Cross-validation and testing for false positives are examples of evaluation techniques available in data mining tools. The deployment phase is the point at which you start using the results.” – Data Mining Tools, RapidMiner; Twitter: @RapidMiner
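The evaluation phase described above mentions cross-validation. Below is a minimal sketch of k-fold cross-validation written from scratch, paired with a one-feature nearest-neighbor classifier as a stand-in supervised model. It is not how RapidMiner implements it. The toy data and the function names are hypothetical.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def nearest_neighbor_predict(train_x, train_y, x):
    """Predict the label of the closest training point (1-NN, one feature)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]

def cross_validate(xs, ys, k=3):
    """Return the mean accuracy of the 1-NN model across k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        hits = sum(
            nearest_neighbor_predict(tx, ty, xs[i]) == ys[i] for i in test
        )
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

# Hypothetical labeled data, interleaved so every fold sees both labels.
xs = [1.0, 9.0, 1.5, 8.5, 2.0, 8.0]
ys = ["low", "high", "low", "high", "low", "high"]

accuracy = cross_validate(xs, ys, k=3)
print(accuracy)
```

Each record is used for testing exactly once and for training in the other folds, which gives a fairer estimate of how the model behaves on new data than scoring it on the data it was trained on. With real data the rows should be shuffled before splitting.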
“While mining ‘Big Data’ has myriad benefits, it also presents some unique challenges. Working with enormous volumes of data introduces concerns around data quality and accuracy, efficiency and scalability, and costly investments into software, servers and storage hardware that handle it.
“Aggregating data from an array of sources — CRMs, ERP platforms, social media, and other systems — makes it difficult to guarantee that the data is clean and usable. Poor data quality such as incomplete, inaccurate, and duplicate data can wreak havoc on mining activities and negate the value of insights gained. Plus, combining data from different sources also comes with the added challenge of standardizing formats, as rich data can take many forms: multimedia files (audio, video and images), geolocation data, SMS, social media data, among many others.
“The sheer volume of data required for deep mining activities means data mining algorithms need to be efficient, powerful, and scalable.” – Garrett Alley, What is Data Mining?, Alooma; Twitter: @aloomainc
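The data quality problems listed above (incomplete, duplicate, and inconsistently formatted records from several sources) can be illustrated with a small cleaning pass. The sketch assumes two hypothetical sources, `crm` and `erp`, and treats a normalized email address as the identity key. Real deduplication usually needs fuzzier matching than this.

```python
def normalize(record):
    """Standardize formatting so the same customer matches across sources."""
    return {
        "email": (record.get("email") or "").strip().lower(),
        "name": " ".join((record.get("name") or "").split()).title(),
    }

def clean(records):
    """Split records into cleaned and rejected lists.

    Incomplete records are rejected; duplicates (same normalized
    email) are dropped, keeping the first occurrence.
    """
    seen = set()
    cleaned, rejected = [], []
    for raw in records:
        rec = normalize(raw)
        if not rec["email"] or not rec["name"]:
            rejected.append(raw)
        elif rec["email"] not in seen:
            seen.add(rec["email"])
            cleaned.append(rec)
    return cleaned, rejected

# Hypothetical exports from two systems describing overlapping customers.
crm = [{"email": "Ana@Example.com ", "name": "ana  lopez"}]
erp = [
    {"email": "ana@example.com", "name": "Ana Lopez"},
    {"email": "", "name": "Unknown"},
    {"email": "li@example.com", "name": "li wei"},
]

cleaned, rejected = clean(crm + erp)
print(cleaned)
print(rejected)
```

Keeping the rejected records instead of silently discarding them makes it possible to measure how much of each source is unusable, which is itself a useful data quality metric.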
What’s the most important feature you look for when comparing data mining tools?