Artificial intelligence (AI) is intelligence demonstrated by machines, as opposed to natural intelligence displayed by animals including humans. Leading AI textbooks define the field as the study of “intelligent agents”: any system that perceives its environment and takes actions that maximize its chance of achieving its goals. Some popular accounts use the term “artificial intelligence” to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”; however, this definition is rejected by major AI researchers.
AI applications include advanced web search engines (e.g., Google), recommendation systems (used by YouTube, Amazon and Netflix), understanding human speech (such as Siri and Alexa), self-driving cars (e.g., Tesla), automated decision-making and competing at the highest level in strategic game systems (such as chess and Go). As machines become increasingly capable, tasks considered to require “intelligence” are often removed from the definition of AI, a phenomenon known as the AI effect. For instance, optical character recognition is frequently excluded from things considered to be AI, having become a routine technology.
Artificial intelligence was founded as an academic discipline in 1956, and in the years since has experienced several waves of optimism, followed by disappointment and the loss of funding (known as an “AI winter”), followed by new approaches, success and renewed funding. AI research has tried and discarded many different approaches since its founding, including simulating the brain, modelling human problem solving, formal logic, large databases of knowledge and imitating animal behavior. In the first decades of the 21st century, highly mathematical statistical machine learning has dominated the field, and this technique has proved highly successful, helping to solve many challenging problems throughout industry and academia.
The various sub-fields of AI research are centered around particular goals and the use of particular tools. The traditional goals of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and the ability to move and manipulate objects. General intelligence (the ability to solve an arbitrary problem) is among the field’s long-term goals. To solve these problems, AI researchers have adapted and integrated a wide range of problem-solving techniques, including search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, probability and economics. AI also draws upon computer science, psychology, linguistics, philosophy, and many other fields.
The field was founded on the assumption that human intelligence “can be so precisely described that a machine can be made to simulate it”. This raises philosophical arguments about the mind and the ethics of creating artificial beings endowed with human-like intelligence. These issues have been explored by myth, fiction, and philosophy since antiquity. Science fiction and futurology have also suggested that, with its enormous potential and power, AI may become an existential risk to humanity.
Goals of Artificial Intelligence
The main goals of Artificial Intelligence are:
- Replicating human intelligence
- Solving knowledge-intensive tasks
- Making an intelligent connection between perception and action
- Building a machine which can perform tasks that require human intelligence, such as:
  - Proving a theorem
  - Playing chess
  - Planning a surgical operation
  - Driving a car in traffic
- Creating a system which can exhibit intelligent behavior, learn new things by itself, demonstrate, explain, and advise its user.
Advantages of Artificial Intelligence
High Speed: AI systems can operate at very high speed and make decisions quickly, which is why an AI system can beat a chess champion at chess.
High Accuracy with fewer errors: AI machines or systems tend to make fewer errors and achieve high accuracy, because they take decisions based on prior experience or information.
High reliability: AI machines are highly reliable and can perform the same action multiple times with high accuracy.
Digital Assistant: AI can be very useful for providing digital assistance to users; for example, various e-commerce websites currently use AI technology to show products that match customer requirements.
Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human can be risky.
Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars that can make our journeys safer and hassle-free, facial recognition for security purposes, and natural language processing to communicate with humans in human language.
Disadvantages of Artificial Intelligence
Every technology has some disadvantages, and the same goes for artificial intelligence. Despite being so advantageous, the technology still has some disadvantages that we need to keep in mind while creating an AI system. The disadvantages of AI are:
Can’t think out of the box: Even though we are making smarter machines with AI, they still cannot think outside the box, as a robot will only do the work for which it is trained or programmed.
High Cost: The hardware and software requirements of AI are very costly, as AI requires a lot of maintenance to meet current requirements.
No feelings and emotions: AI machines can be outstanding performers, but they do not have feelings, so they cannot form any kind of emotional attachment with humans, and they may sometimes be harmful to users if proper care is not taken.
No Original Creativity: Humans are creative and can imagine new ideas, but AI machines cannot match this power of human intelligence and cannot be creative and imaginative.
Increased dependency on machines: With the advancement of technology, people are becoming more dependent on devices and hence are losing their mental capabilities.
Data Sources:
Primary And Secondary Sources of Data
To analyze, present, and interpret information from the data, there has to be a process of gathering and sorting the data. There are different methods to gather data, all of which fall into two categories: primary data sources and secondary data sources.
The term primary data refers to data originated by the researcher, while secondary data is already existing data collected by agencies and organizations for the purpose of conducting an analysis. Primary data sources can include surveys, observations, questionnaires, experiments, personal interviews, and more. The data from ERP (Enterprise Resource Planning) and CRM (Customer Relationship Management) systems can also be used as a primary source of data. In contrast, secondary data sources can be government publications, staging websites, publications from independent research labs, journal articles, etc. A “raw” data set that has been transformed into another format in the process of data wrangling can also be seen as a secondary data source. Secondary data can be key to data enrichment when the primary source data does not carry enough information, and it can improve the precision of the analysis by adding more attributes and variables to the sample.
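As a minimal sketch of the enrichment idea described above, the following Python snippet adds attributes from a secondary source to primary records. All names and values here (the survey rows, the region lookup, the `enrich` helper) are made up for illustration and are not taken from any real data set.

```python
# Hypothetical primary data: survey responses collected by the researcher.
primary_rows = [
    {"respondent_id": 1, "region": "North", "satisfaction": 4},
    {"respondent_id": 2, "region": "South", "satisfaction": 2},
    {"respondent_id": 3, "region": "East", "satisfaction": 5},
]

# Hypothetical secondary data: figures published by an outside agency.
secondary_by_region = {
    "North": {"median_income": 41000},
    "South": {"median_income": 36000},
}


def enrich(rows, lookup, key):
    """Return copies of rows with the matching lookup attributes added."""
    enriched = []
    for row in rows:
        extra = lookup.get(row[key], {})  # no match: the row is kept unchanged
        enriched.append({**row, **extra})
    return enriched


enriched_rows = enrich(primary_rows, secondary_by_region, "region")
for row in enriched_rows:
    print(row)
```

Rows without a match in the secondary source are kept as they are, so the enrichment never drops primary data; whether that is the right choice depends on the analysis.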
Quantitative And Qualitative Data
Data can be defined by a set of variables of a qualitative or quantitative nature.
Qualitative Data
Qualitative data refers to the data that can provide insights and understanding about a particular problem.
Quantitative Data
Quantitative data, as the name suggests, deals with quantities or numbers. This numerical data can be grouped into categories, or so-called classes.
Although both types of data can be considered as separate entities providing different outcomes and information about a sample, it is important to understand that both types are often needed to perform quality analysis. Without knowing why we are seeing a certain pattern in behavioral data, we may try to solve the wrong problem, or the right problem incorrectly. A real-life example would be collecting qualitative data about customer preferences, and quantitative data about the number and the age of customers, in order to analyze the level of customer satisfaction and find a pattern or correlation between changing preferences and different customer age groups.
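The customer example above can be sketched in a few lines of Python. The records, the age boundaries, and the `age_group` helper are assumptions chosen only to illustrate how a quantitative variable (age) can be binned into classes and combined with a qualitative one (preference).

```python
from collections import Counter, defaultdict

# Hypothetical customer records: age is quantitative, preference is qualitative.
customers = [
    {"age": 22, "preference": "mobile app"},
    {"age": 27, "preference": "mobile app"},
    {"age": 41, "preference": "in-store"},
    {"age": 45, "preference": "website"},
    {"age": 48, "preference": "in-store"},
    {"age": 63, "preference": "in-store"},
]


def age_group(age):
    """Map an age to a coarse class; the boundaries are arbitrary."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


# Count how often each preference occurs within each age group.
preferences_by_group = defaultdict(Counter)
for customer in customers:
    group = age_group(customer["age"])
    preferences_by_group[group][customer["preference"]] += 1

for group, counts in preferences_by_group.items():
    top_preference, count = counts.most_common(1)[0]
    print(f"{group}: {top_preference} ({count} of {sum(counts.values())})")
```

A table like this only shows that preferences differ between age groups; the qualitative answers are still needed to understand why.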
Types of Data Sources
Data can be captured in many different shapes, and some may be easier to extract than others. Having data in different shapes requires different storage solutions and should therefore be approached in different ways. At Kantify, we distinguish between three shapes of data: structured data, unstructured data, and semi-structured data.
Structured Data
Structured data is tabular data, containing columns and rows which are very well defined. The main advantage of this type of data is that it is easily stored, entered, queried, modified, and analyzed. Structured data is often managed with Structured Query Language (SQL), a programming language created for managing and querying data in relational database management systems.
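To make the idea of storing, modifying, and querying structured data concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Well-defined columns are what make the data "structured".
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL)"
)

# Enter rows (parameterized to avoid building SQL strings by hand).
conn.executemany(
    "INSERT INTO customers (name, age) VALUES (?, ?)",
    [("Ana", 34), ("Ben", 52), ("Chloe", 29)],
)

# Modify one row.
conn.execute("UPDATE customers SET age = ? WHERE name = ?", (35, "Ana"))

# Query: customers aged 30 or older, youngest first.
older_customers = conn.execute(
    "SELECT name, age FROM customers WHERE age >= ? ORDER BY age", (30,)
).fetchall()
print(older_customers)

conn.close()
```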
Unstructured Data
Unstructured data is the rawest form of data, and it can be of any type or file format: pictures and graphic images, webpages, PDF files, videos, emails, word processing documents, etc. This data is often stored in repositories of files. Extracting valuable information out of this type of data can be somewhat challenging. For example, a text can be analyzed by extracting the topics it covers and whether the text is positive or negative about them.
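The text-analysis example above can be illustrated with a deliberately simple keyword-based sketch. Real systems typically rely on trained language models; the word lists, topic keywords, and the `analyze` function below are assumptions made up for this illustration and should not be read as a production approach.

```python
import re

# Tiny hand-made lexicons; purely illustrative.
POSITIVE_WORDS = {"good", "great", "fast", "helpful"}
NEGATIVE_WORDS = {"bad", "slow", "broken", "rude"}
TOPIC_KEYWORDS = {
    "delivery": {"delivery", "shipping"},
    "support": {"support", "agent"},
}


def analyze(text):
    """Return a score per topic: above 0 is positive, below 0 is negative."""
    scores = {}
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        sentiment = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:  # the sentence mentions this topic
                scores[topic] = scores.get(topic, 0) + sentiment
    return scores


review = "The delivery was slow. The support agent was helpful and fast!"
topic_scores = analyze(review)
print(topic_scores)
```

The sketch scores each sentence separately and attributes the score to every topic mentioned in it, which is enough to show the idea but breaks down on negation, sarcasm, and longer texts.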
Semi-Structured Data
As the name implies, semi-structured data is a cross between structured and unstructured data. Semi-structured data may have a consistent, defined format; however, the structure may not be very strict. The structure is not necessarily tabular, and parts of the data may be incomplete or contain differing types. An example is photos or other graphics tagged with keywords, which makes it easy to organize and locate them.
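A common carrier for semi-structured data is JSON. The sketch below, with made-up file names, tags, and fields, shows records that share a loose format but have a missing field and a value of a differing type, and how code reading them can allow for that.

```python
import json

# Hypothetical tagged graphics. Note the missing "width" in the second
# record and the string-typed "width" in the third.
raw = """
[
  {"file": "beach.jpg", "tags": ["sea", "summer"], "width": 1920},
  {"file": "logo.svg", "tags": ["brand"]},
  {"file": "chart.png", "tags": ["sales", "summer"], "width": "800"}
]
"""

records = json.loads(raw)


def find_by_tag(items, tag):
    """Return the file names of all items carrying the given tag."""
    return [item["file"] for item in items if tag in item.get("tags", [])]


def width_of(item):
    """Return the width as an int, or None when the field is missing."""
    value = item.get("width")
    return int(value) if value is not None else None


summer_files = find_by_tag(records, "summer")
widths = [width_of(record) for record in records]
print(summer_files)
print(widths)
```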
Historical And Real-Time Data
Historical datasets can help in answering exactly the types of questions that decision-makers would like to benchmark against real-time data. Historical data sources can be best suited for building or modifying predictive or prescriptive models, and for offering insights that can improve long-term and strategic decision making. Real-time data is, in its most basic definition, data that is passed along to the end user as quickly as it is gathered. Real-time data can be enormously valuable in things like traffic GPS systems, in benchmarking different kinds of analytics projects, and for keeping people informed through instant data delivery.
In predictive analytics, both types of data sources should be given equal consideration, as both can help in predicting and identifying future trends.
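One simple way the two kinds of data can work together is to derive a baseline from historical data and compare each incoming real-time reading against it. The sketch below assumes made-up travel times and an arbitrary threshold of three standard deviations; both are illustrative choices, not recommendations.

```python
from statistics import mean, stdev

# Hypothetical historical data: past travel times on one route, in minutes.
historical_travel_minutes = [21, 23, 22, 24, 20, 22, 23, 21]

# The baseline and spread are computed once from the historical data.
baseline = mean(historical_travel_minutes)
spread = stdev(historical_travel_minutes)


def is_unusual(reading, threshold=3.0):
    """Flag a live reading that is far from the historical baseline."""
    return abs(reading - baseline) > threshold * spread


# Hypothetical real-time readings, checked as they arrive.
live_readings = [22, 24, 35]
flags = [is_unusual(reading) for reading in live_readings]

for reading, flagged in zip(live_readings, flags):
    status = "unusual" if flagged else "normal"
    print(f"{reading} min: {status}")
```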
Internal and External Data
Internal Data
Internal data is information gathered within an organization and can cover areas such as personnel, operations, finance, maintenance, procurement, and many more. Internal data can provide information on employee turnover, sales success, profit margins, structure and dynamics of an organization, etc.
External Data
External data is information gathered from outside the organization, including from customers, staging websites, agencies, and more. For example, external data gathered from social media can provide insights about the behavior, preferences, and motivations of customers. At this stage, you may wonder if internal data is the same as primary data, and external data the same as secondary data. This is close but slightly different. The categorization of internal and external data sources is mostly about where the data comes from: whether it was collected within your organization or from a source outside it. The notion of primary and secondary data refers instead to the purpose and time frame for which the data was collected: whether it was collected by the researcher for a specific project, or obtained from another source, even within the same organization.
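The distinction drawn above amounts to two independent questions about a data source, which can be sketched as two separate flags. The `DataSource` class, its field names, and the example sources are hypothetical and serve only to show that the two labels vary independently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    from_own_organization: bool       # internal vs. external
    collected_for_this_project: bool  # primary vs. secondary

    def labels(self):
        """Return the (origin, purpose) labels for this source."""
        origin = "internal" if self.from_own_organization else "external"
        purpose = "primary" if self.collected_for_this_project else "secondary"
        return origin, purpose


sources = [
    DataSource("employee interviews run for this study", True, True),
    DataSource("finance report reused from another department", True, False),
    DataSource("government publication", False, False),
]

for source in sources:
    origin, purpose = source.labels()
    print(f"{source.name}: {origin}, {purpose}")
```

The second source is the case the text singles out: data from within the same organization that still counts as secondary, because it was not collected for the project at hand.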