Total concepts: 98
Ensuring that all individuals, regardless of their abilities or contexts, can interact with data and analysis tools. This involves creating inclusive interfaces, understandable formats, and open data.
An algorithm can be understood as a set of instructions or a kind of recipe for carrying out a process. When applied to computers, an algorithm tells a computer how to perform a certain task. So, every time you give it the command to do something, the computer will know what steps to follow.
It is the place where raw data is stored before being processed into more manageable formats.
A technique that uses historical data, algorithms, and statistical models to forecast future events or trends. It helps in making anticipatory decisions.
The process of collecting, measuring, and analyzing data on user behavior on a website. It helps optimize user experiences, identify trends, and make strategic decisions.
Machine learning is a type of artificial intelligence in which machines or computer algorithms "learn" to improve through experience.
A subfield of artificial intelligence (AI) and machine learning (ML) that uses artificial neural networks inspired by the structure of the human brain to process large amounts of data and solve complex problems. It is widely used in various applications due to its ability to learn hierarchical representations of data.
A machine learning technique in which a model is trained with labeled data to make predictions or classifications. For example, identifying emails as "spam" or "not spam" based on previously classified examples.
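As a rough illustration of training on labeled data, the sketch below builds a toy word-count classifier from four labeled messages. The training messages, the `train` and `predict` helpers, and the overlap-based scoring rule are all hypothetical and chosen for brevity; real spam filters rely on far larger datasets and more robust models.

```python
from collections import Counter

# Hypothetical labeled examples: (message, label)
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training words overlap most with the message."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
prediction = predict(model, "claim your free prize")
```

Here `prediction` is derived only from the previously classified examples, which is the defining trait of supervised learning.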
The organization and logical structuring of content and data to facilitate access and understanding. It is applied in web design, applications, and complex systems, where clear menus and well-defined categories enhance the user experience.
The process of using tools and technologies to collect, process, and analyze data without manual intervention. It improves efficiency and reduces errors, such as automating daily sales report updates.
Some unstructured data, such as full documents, texts or videos, is often stored in NoSQL databases. These databases are designed with different types of data in mind. A NoSQL database is non-relational, which means that it does not require data to fit the fixed rows and columns of a table.
Big data is a field that works on solutions for collecting, transporting and storing very large amounts of data.
High quality data is accurate, complete, consistent and valid. The better its quality, the greater the likelihood of obtaining valuable information from it.
The ability of a system or design to quickly adapt to user needs, such as automatically adjusting to the screen size.
A data catalog is a metadata management tool that companies use to inventory and organize data in their systems. Typical benefits include improvements in data discovery, governance and information access.
An interface that uses natural language to automate tasks, answer questions, or explore complex data.
A set of practices and technologies designed to protect systems, networks, and data from attacks or unauthorized access.
It can be described as the study of data. Data scientists conduct experiments and research, committed to solving problems and finding answers. They piece together different data, dissect patterns, look for anomalies, generate charts and graphs, explore machine learning and artificial intelligence. If data mining is about extracting value, data science is about generating value.
An open-source platform for efficiently managing and publishing open data. It facilitates the organization, access, and visualization of datasets, promoting transparency and data sharing.
A CMS is software used to manage the content of a website.
Technology that enables access to storage, processing, and applications over the internet without relying on local infrastructure.
They are collections of data with a shared theme. Frequently, people look for data sets for their research. They are practical because when analyzed as a whole, they provide the full context of a problem.
When used in relation to databases, the word "query" has a very similar meaning to "search". If someone performs a query on a database, it means that they have searched for a specific piece of data or set of data.
A symbol or area marked and delimited on a map that shows the distribution of some property.
CSV stands for "comma-separated values". A CSV file, as the name implies, separates data with commas. This makes it easy to import into tables such as spreadsheets, because each comma-delimited value becomes its own "field".
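To make the idea of comma-delimited fields concrete, the following minimal sketch reads a small CSV string with Python's standard `csv` module. The content of `raw` (column names and values) is invented purely for illustration.

```python
import csv
import io

# Hypothetical CSV content; the first line holds the field names.
raw = "product,units\napples,3\npears,5\n"

# Each row becomes a dictionary keyed by the header fields.
rows = list(csv.DictReader(io.StringIO(raw)))
products = [row["product"] for row in rows]
```

Note that every value is read as text, so a field such as `units` would need to be converted before doing arithmetic with it.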
It can be understood from two perspectives. The first has to do with an organization's ability to use data to make data-driven decisions. The second is about the linkage and connections between culture, art and data. In either case, it speaks to the appropriation of data for conscious action.
A spline is a mathematical function that generates a smooth and continuous curve from data points, fitting them precisely. It is used in computer graphics, geometric modeling, and data analysis to connect points with seamless transitions. It enables the creation of complex shapes with control over the curve while minimizing jumps between segments. Tools like PowerPoint or Illustrator make it easy to visualize with "curve line" options.
Open data is data that can be freely accessed, used, distributed and copied by anyone. It is "public" data and, as such, is released without intellectual property restrictions that would prevent its reuse.
Categorical data is data that can be divided into groups or categories.
Unstructured data is usually qualitative data and is often stored in NoSQL databases. It is useful, but sometimes not so practical for analysis and information generation purposes, as it cannot be visualized well in analysis tools such as graphs and tables. Examples of this type of data are video, audio or satellite images.
Discrete data is a type of quantitative data that includes numbers and statistics from individual, non-divisible data points that can be counted. Discrete data points are usually written as numbers that represent exact values, and discrete data usually represent single events that have already occurred.
Raw data is data that has not been processed or transformed in any way. It is data that has been taken directly from the source.
In most cases, structured data are quantitative data. They are easy to organize in spreadsheets and relational databases, and easy to visualize. Examples include names, order numbers and geolocation. It is easier to generate information from structured data than from unstructured data.
Numerical data is a type of data expressed in numbers. It is sometimes referred to as quantitative data and is distinguished from other data written as numbers by the fact that arithmetic operations can be performed on it.
Sometimes, those who produce and manage data or information for public consumption are people in positions of power who decide what information to show. Increasingly, however, anyone can create, use and make decisions from data, which is known as data democratization.
Back-end programmers specialize in the "behind the scenes" of a website or software. They deal with how things work on the inside and create the components that the user accesses through the front-end application.
Some programmers specialize in creating the "front-end" or graphical interface of a website or software that users interact with. While back-end developers focus on building the components and functions that make a website or software work, front-end developers build the applications that allow users to access these components.
A full-stack developer can work on both back-end and front-end development, so they have an overview of all aspects of building a website or software.
In statistics, it is a means of describing how widely data are dispersed around a central value or point. It helps to understand the distribution of data. A smaller spread indicates greater precision in a manufacturing process or in data measurements, while a larger spread means less precision.
A .docx file is a document file in Microsoft Word's Open XML format. These files are smaller and easier to support than .doc files because the format is XML-based: all content is stored as separate files, which are then packaged into a single ZIP-compressed file.
To integrate or embed content, such as videos, graphics, or applications, within a webpage or another platform. It allows viewing external resources without leaving the current environment.
It is a calculated numerical value that characterizes some aspect of a sample data set. It usually serves to estimate the true value of a corresponding parameter in an underlying population.
User Experience (UX) aims to understand how users react to and feel about specific digital products, such as websites or applications. UX designers, using user-centered methodologies, create interfaces designed to maximize interaction and encourage user engagement.
It is the process by which data is taken from one source and moved into a larger container - or database - with lots of other data. Its name describes the process: data is taken ("extracted") from a source, converted ("transformed") into a uniform format and placed ("loaded") into a larger store. This process seeks to facilitate the manipulation of the data and its storage in a logical way, in order to facilitate its use.
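The three stages described above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the CSV content in `source`, the `payments` table, and the choice of an in-memory SQLite database as the "larger store" are all hypothetical.

```python
import csv
import io
import sqlite3

# Extract: read rows from a hypothetical CSV source.
source = "name,amount\n alice ,10.5\nBOB,3\n"
extracted = list(csv.DictReader(io.StringIO(source)))

# Transform: trim and normalize names, convert amounts to numbers.
transformed = [
    (row["name"].strip().title(), float(row["amount"])) for row in extracted
]

# Load: insert the uniform rows into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (name TEXT, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", transformed)
conn.commit()

loaded = conn.execute(
    "SELECT name, amount FROM payments ORDER BY name"
).fetchall()
```

After the load step, the inconsistently formatted source values are stored in one uniform shape that is easier to query.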
Data management describes the process of collecting, cataloging and processing data within an organization to achieve a certain outcome. More and more organizations are adopting "data management systems" to simplify data processes. These data management systems aim to make data management an everyday activity for non-technical staff.
A set of standards, processes, and institutional arrangements that manage the responsible use of data to maximize its value without compromising rights. It involves coordination among entities, adoption of common standards, and consultation with stakeholders, balancing openness, protection, innovation, and regulation.
Open government is a governance model that recognizes that citizens have the right to access government documents and procedures. The concept has a broad scope, but is often linked to the ideas of access to information, participation, accountability, innovation, government coordination, integrity, civic engagement, budget transparency and anti-corruption.
It is a meeting between developers, data scientists and other related profiles, in which they work intensively on a specific project for a specific period of time. They can arise to devise solutions for specific projects of the organization or even to solve global problems.
PII is the name given to any information related to a specific individual. It can be very basic information, such as a name or number, or very sensitive information, such as bank details or medical records. Because PII can say something about private individuals, it is often regulated by data protection legislation.
Systems and resources required to collect, store, process, and share data.
A data engineer creates structures to host and connect data. In order for a data scientist to analyze large data sets, he or she first needs a data engineer to build the mechanisms necessary to collect and process this data.
Valuable knowledge or conclusion derived from the analysis of data or information. It helps understand trends, solve problems, or make strategic decisions.
Data integration is an aspect of data management that focuses on bringing together data from many different sources. Integrating data properly minimizes the margin of error in all data-driven decisions an organization makes.
Technology that simulates human capabilities such as learning, reasoning, or perception, used in applications like chatbots, virtual assistants, or recommendation systems.
An API can be thought of as a messenger. It goes back and forth between two applications, receiving a request and returning a response.
The user interface is the part of a piece of software that allows the user to interact directly with it. Websites and microsites are common user interfaces that may contain numerous interactive elements, such as drop-down menus or progress bars.
More and more household devices are connecting to the Internet, giving rise to the concept of the Internet of Things (IoT): the integration of everyday objects with the network. In homes, this technology aims to make users' lives easier, while on a larger scale, it drives the development of smart cities by optimizing urban services.
Ability of digital systems and services to exchange, understand, and use data in a fluid and standardized way. It is a fundamental principle in technological development that allows different platforms, applications, and databases to communicate efficiently with each other, regardless of their architecture or provider.
The .jpeg or .jpg file extensions are used for image files compressed to the Joint Photographic Experts Group (JPEG) standard. They support up to 24-bit color and use lossy compression, which can significantly reduce image quality if high compression levels are applied.
JavaScript Object Notation (JSON) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute-value pairs and arrays.
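As a brief illustration of attribute-value pairs and arrays, the sketch below converts a Python dictionary to JSON text and back using the standard `json` module. The `record` contents are made up for the example.

```python
import json

# Hypothetical data object with attribute-value pairs and an array.
record = {"name": "sensor-1", "active": True, "readings": [1.5, 2.0]}

# Serialize to human-readable text, then parse it back into an object.
text = json.dumps(record)
restored = json.loads(text)
```

The intermediate `text` value is an ordinary string, which is what makes JSON convenient for storing and transmitting data between systems.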
The SQL programming language allows you to query, edit or delete data stored in a relational database management system.
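The operations mentioned above can be tried out with Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical and exist only in memory for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Ana", 20.0), ("Luis", 35.5), ("Ana", 12.0)],
)

# Query: find the orders placed by one customer.
ana_orders = conn.execute(
    "SELECT id, total FROM orders WHERE customer = ? ORDER BY id", ("Ana",)
).fetchall()

# Edit: change the total of an existing order.
conn.execute("UPDATE orders SET total = 40.0 WHERE customer = ?", ("Luis",))

# Delete: remove orders below a threshold.
conn.execute("DELETE FROM orders WHERE total < 15")

remaining = conn.execute(
    "SELECT customer, total FROM orders ORDER BY id"
).fetchall()
```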
Programming languages allow humans to interact with computers in terms that both parties can understand and interpret. Some of the most common ones are Python, JavaScript, C#, C++ and C.
The process of preparing data for proper use and analysis. It includes tasks such as correcting missing values, adjusting formats, and verifying consistency, ensuring reliable and accurate data.
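A minimal sketch of such preparation tasks follows, assuming records arrive as dictionaries with `name` and `age` fields. The `clean_records` helper, the field names, and the rule of dropping rows without a name are hypothetical choices made for illustration.

```python
def clean_records(records, default_age=None):
    """Trim text, standardize case, and convert ages to integers where possible."""
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        raw_age = str(rec.get("age") or "").strip()
        # Missing or non-numeric ages fall back to the default value.
        age = int(raw_age) if raw_age.isdigit() else default_age
        if name:  # drop rows without a usable name
            cleaned.append({"name": name, "age": age})
    return cleaned


raw = [
    {"name": "  maria lopez ", "age": "34"},
    {"name": "JUAN PEREZ", "age": ""},
    {"name": "", "age": "50"},
]
result = clean_records(raw)
```

Whether to drop, flag, or fill incomplete rows depends on the analysis at hand; the sketch simply picks one option for each case.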
The result obtained by adding two or more quantities and dividing the total by the number of quantities. Some characteristics of the mean are that it considers all values, the numerator in the formula is the sum of all values and the denominator is the number of values, and when there are extreme values, it may not provide an accurate representation of the sample.
The median is the middle value of a list of data when the values are arranged in order. When the list has an even number of values, the median is usually taken as the average of the two middle values.
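The difference between the mean and the median is easiest to see with a small sample that contains one extreme value. The sketch below uses Python's standard `statistics` module; the numbers in `values` are invented.

```python
from statistics import mean, median

# Hypothetical sample with one extreme value (100).
values = [2, 3, 3, 4, 100]

average = mean(values)    # (2 + 3 + 3 + 4 + 100) / 5 = 22.4
middle = median(values)   # middle value of the sorted list = 3

# With an even number of values, the two middle values are averaged.
even_middle = median([1, 2, 3, 4])  # (2 + 3) / 2 = 2.5
```

The extreme value pulls the mean far above most of the sample, while the median stays close to the typical values.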
Metadata is data about data. For one piece of data, there is usually a lot of other metadata, that is, pieces of information that describe that data. A good example is a document on your computer. The document itself is the data, and information such as the time and date of creation, file size, and storage location are the metadata.
Simple data, easy for humans to process and understand. They are relevant for solving specific or local problems, such as purchase records in a small store.
A webpage or set of pages dedicated to a specific topic, campaign, or project, separate from the main website. It can be external, such as a promotional site, or internal, designed for communication and resources within an organization, such as training or internal updates.
The process of transferring data from one system or format to another, ensuring its integrity and compatibility. It is key in technological upgrades or platform changes.
The goal of the data mining process is to extract as much value and "usable" information as possible from the raw data.
It is the value that appears most frequently in a set or group of values. It is possible for a group of values to have two or more modes, or for there to be none. It is a simple measure to read, and the values involved can be either quantitative or qualitative.
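As an illustration, the standard `statistics.multimode` function (available from Python 3.8) returns every most frequent value, which covers both the single-mode and multi-mode cases. The sample lists are made up.

```python
from statistics import multimode

# Two values tie for most frequent, so there are two modes.
numeric_modes = multimode([1, 2, 2, 3, 3])

# The same idea applies to qualitative values.
color_modes = multimode(["red", "blue", "red"])

# When every value appears once, multimode returns all of them;
# by the convention in the text, such a set is said to have no mode.
all_unique = multimode([1, 2, 3])
```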
Artificial intelligence algorithms trained on large volumes of text to understand and generate human language coherently. They are useful in tasks such as writing, translation, and text analysis.
A file with the .pdf extension is a Portable Document Format (PDF) file. It was created with two goals in mind: that people could open the documents on any hardware or operating system without needing the application used to create them, and that the layout of the document would be retained when opened.
A human-centered methodology for solving problems creatively. It focuses on understanding user needs, fostering empathy, ideation, and experimentation. It aims to incorporate unique perspectives that resonate with users and create value in the product or service.
It is a journalism discipline that investigates through the collection and analysis of data, usually in large quantities, using specialized software. This type of journalism makes information understandable to the audience through narrative, articles, visualizations or interactive applications.
Portable Network Graphics (PNG) is a raster graphics file format that uses lossless image compression. It was designed as an improved, non-patented replacement for the Graphics Interchange Format (GIF).
They are online user interfaces that allow users to access open data collections. Two of the most common types of organizations that publish data through open data portals are governments and research organizations.
A branch of artificial intelligence that enables machines to understand, interpret, and generate human language. It is used in tasks such as machine translation and chatbots.
Ratio, in general, refers to a part, share or number considered in comparative relation to a whole. If two given sets of numbers increase or decrease in the same ratio, then they are said to be directly proportional to each other. For example, 2/4, 4/8 and 1/2 all express the same proportion.
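The equivalence of these fractions can be checked with Python's standard `fractions` module. The `xs` and `ys` lists are hypothetical paired quantities used to show one way of testing direct proportionality.

```python
from fractions import Fraction

# Fractions are reduced automatically, so equivalent ratios compare equal.
same_proportion = Fraction(2, 4) == Fraction(4, 8) == Fraction(1, 2)

# Hypothetical paired quantities: y grows in the same ratio as x.
xs = [1, 2, 3]
ys = [4, 8, 12]

# Directly proportional quantities share a single constant ratio y / x.
ratios = {Fraction(y, x) for x, y in zip(xs, ys)}
directly_proportional = len(ratios) == 1
```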
A technology that converts printed or handwritten text into digitized, editable text. It is useful for digitizing documents, extracting information, and automating processes. Example: scanning invoices and converting them into text for accounting analysis.
Data scraping is a way of getting data from a website into a local file on the computer, such as a spreadsheet.
Data management systems enable organizations to maximize the information they can get from their data. They can collate all their data, collect data from sources outside their organization, and query the data in their database to identify or discover new information.
Sometimes organizations choose to subscribe to software services instead of installing expensive and complex software. These services are usually available online and are not tied to a specific device. An added advantage of SaaS is that you typically pay only for what you use, as you would with a utility service.
Open source software is released under a license that allows anyone to use, study, modify and redistribute it. Its source code can be viewed and modified by anyone with programming knowledge.
Proprietary software (also known as closed source software) is distributed under restrictive license terms and is protected by intellectual property rights, such as copyrights. Unlike open source software, proprietary software does not allow end users to view or modify the source code.
A technique to communicate data findings clearly and persuasively by combining visualizations, narratives, and context. It helps transform complex data into understandable and actionable messages.
An interactive visual interface that centralizes and presents key data through charts, maps, and dynamic tables, allowing users to filter, explore, and analyze information.
Digital technologies and tools that enhance citizen participation, improve public services, and promote transparency. For example, platforms to track public budgets or report issues in urban transportation.
A software development method that uses visual tools and pre-built components to create applications with minimal coding. It accelerates development and reduces costs.
In computing, plain text refers to text that has no formatting whatsoever.
The process of replacing sensitive data with unique identifiers or tokens that have no value outside of their context. It protects information such as credit card numbers or personal data.
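The idea can be sketched with a small in-memory lookup table. The `TokenVault` class, the `tok_` prefix, and the placeholder card number are all hypothetical; production systems keep the mapping in a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to original values."""

    def __init__(self):
        self._store = {}

    def tokenize(self, value):
        # The token is random, so it reveals nothing about the original value.
        token = "tok_" + secrets.token_hex(8)
        self._store[token] = value
        return token

    def detokenize(self, token):
        # Only a party with access to the vault can recover the original.
        return self._store[token]


vault = TokenVault()
card = "0000-0000-0000-0000"  # made-up placeholder, not a real card number
token = vault.tokenize(card)
```

Other systems can store and pass around `token` freely, since it has no meaning outside the vault that issued it.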
A principle that ensures the clear explanation of how algorithms function, decisions are made, and data is used.
A TXT file is a standard text document containing plain or unformatted text. It can be opened and edited in any text editor or word processing program.
It is a measure of the amount of variation or dispersion in a data set. It shows, roughly, how far values typically lie from the mean.
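A small worked example, assuming the measure described is the population standard deviation, can be computed with Python's standard `statistics` module. The numbers in `values` are invented so that the arithmetic is easy to follow.

```python
from statistics import mean, pstdev

# Hypothetical data set.
values = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(values)  # 40 / 8 = 5

# Squared distances from the mean sum to 32; 32 / 8 = 4; sqrt(4) = 2.
spread = pstdev(values)
```

A result of 2 indicates that values in this set typically lie about two units away from the mean of 5.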
Combination of maps and data to analyze phenomena with a spatial dimension.
WebP is a format developed by Google. It is based on the VP8 video codec and offers rich, high-quality images in a smaller size than PNG or JPEG.
An XLS file is a spreadsheet file created by Microsoft Excel or exported by another spreadsheet program. It contains one or more spreadsheets, which store and display data in table format. XLS files can also store mathematical functions, charts, styles and formatting.
A file with the XLSX extension is an XML-formatted spreadsheet file. It is a ZIP-compressed, XML-based format created by Microsoft Excel version 2007 and later, and it can also be opened by other spreadsheet programs.