- (Topic 2)
As the organization looks to advance its analytics practices, the topic of provisioning access to executive dashboards and visualizations is under discussion. Establishing standards and implementing role-based logins to executive dashboards will address:
Correct Answer:
B
According to the Guide to Business Data Analytics, data security is the protection of data from unauthorized access, use, modification, or destruction. Data security includes the policies, procedures, and technologies that ensure the confidentiality, integrity, and availability of data. Data security is an important aspect of data management, which is the planning, execution, and oversight of the data lifecycle. Data security is also related to data governance, which is the establishment and enforcement of rules, roles, and responsibilities for data quality, access, and usage. Data security is not the same as content management, which is the creation, storage, distribution, and maintenance of digital content.
One way to enhance data security is to provision access to executive dashboards and visualizations according to users' roles and permissions. This helps prevent unauthorized or inappropriate access to sensitive or confidential data and supports compliance with data privacy and ethical standards. By establishing standards and implementing role-based logins to executive dashboards, the organization addresses the data security needs of its analytics practices.
References: Guide to Business Data Analytics, pages 52-53; CBDA Exam Blueprint, page 7; Introduction to Business Data Analytics: An Organizational View, page 10.
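As a rough illustration of role-based provisioning, the sketch below maps roles to the dashboards they may open and denies everything else by default. The role names, dashboard names, and the `can_view` helper are hypothetical and are not taken from the Guide.

```python
# Hypothetical roles and dashboards, for illustration only.
ROLE_DASHBOARDS = {
    "executive": {"revenue_summary", "headcount_costs", "sales_pipeline"},
    "sales_manager": {"sales_pipeline"},
    "analyst": set(),
}


def can_view(role: str, dashboard: str) -> bool:
    """Return True only if the role has been explicitly granted the dashboard."""
    # Unknown roles fall back to an empty set, so access is denied by default.
    return dashboard in ROLE_DASHBOARDS.get(role, set())


print(can_view("executive", "headcount_costs"))
print(can_view("sales_manager", "headcount_costs"))
print(can_view("contractor", "sales_pipeline"))
```

Keeping the grants in one table is what makes the access rules a standard that can be reviewed and audited, rather than a set of ad hoc exceptions.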
- (Topic 2)
An analytics team is sourcing data for a new analytics initiative and is deciding between two comparable data sources. One source being considered is a very large dataset and another consists of three smaller sources. What advantage will the larger dataset provide over the three smaller sources?
Correct Answer:
A
A larger dataset may provide more significant results than three smaller sources, as it may have more statistical power to detect differences or relationships among variables [1]. Statistical power is the probability of finding a statistically significant result when there is a true effect in the population [2]. Other things being equal, a larger dataset has more power because it has less sampling error and yields more precise estimates than a smaller dataset [3]. More significant results may support more confident conclusions and recommendations for the analytics initiative.
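The link between sample size and power can be sketched with a simple calculation. The example below assumes a two-sided one-sample z-test with known standard deviation, which is a stand-in chosen for simplicity; the source does not name a specific test. The function name `z_test_power` and the effect size, standard deviation, and significance level are illustrative assumptions.

```python
from statistics import NormalDist


def z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test with known sigma (assumed setup)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # The standardized shift of the test statistic grows with the square root of n.
    shift = effect * n ** 0.5 / sigma
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Same hypothetical true effect, increasingly large samples.
for n in (50, 200, 800):
    print(n, round(z_test_power(effect=0.2, sigma=1.0, n=n), 3))
```

With the effect held fixed, the computed power rises as `n` grows, which is the sense in which a larger dataset is more likely to produce a statistically significant result.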
Higher validity, more reproducibility, and higher reliability are not necessarily advantages of a larger dataset over three smaller sources, as they depend on other factors besides the size of the data. Validity is the degree to which the data and the analysis measure what they are intended to measure [4]. Reproducibility is the degree to which the data and the analysis can be replicated by another analyst using the same methods and data sources. Reliability is the degree to which the data and the analysis produce consistent results under the same conditions. These qualities may be affected by the quality, accuracy, completeness, and relevance of the data, as well as the appropriateness, transparency, and rigor of the analysis methods. A larger dataset may not be valid, reproducible, or reliable if it has errors, biases, missing values, or irrelevant variables, or if the analysis methods are not suitable, documented, or verified.
References: [1] Guide to Business Data Analytics, IIBA, 2020, p. 54; [2] Introduction to Business Data Analytics: A Practitioner View, IIBA, 2019, p. 23; [3] Data Analysis: The Definitive Guide, Tableau; [4] Guide to Business Data Analytics, IIBA, 2020, p. 26; Introduction to Business Data Analytics: A Practitioner View, IIBA, 2019, p. 25; Introduction to Business Data Analytics: An Organizational View, IIBA, 2019, p. 13.
- (Topic 1)
A new dataset describing employee salaries is received by a company. A colleague wonders whether a variable follows a Gaussian distribution. Which of the following plots would demonstrate this?
Correct Answer:
A
A normal probability plot is a graphical technique for checking whether a variable follows a Gaussian (normal) distribution. It plots the ordered observed values of the variable against the values expected under a normal distribution. If the variable is normally distributed, the points fall approximately along a straight line. A scatterplot, a boxplot, and a LOWESS curve are not suitable for testing normality, as they do not compare the observed values with the theoretical values of the normal distribution.
Reference: https://www.graphpad.com/support/faq/testing-data-for-normal-distrbution/
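The construction behind a normal probability plot can be sketched without a plotting library. The example below computes the plotted points and then measures how close they lie to a straight line using their correlation. The salary-like samples are simulated, and the helper names, the plotting positions (i - 0.5)/n, and the use of correlation as a straightness measure are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean


def normal_probability_points(values):
    """Return (theoretical_quantile, observed_value) pairs for a normal probability plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    return [(std_normal.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(ordered, start=1)]


def straightness(points):
    """Pearson correlation of the plotted points; values near 1 mean a near-straight line."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the simulated samples are repeatable
gaussian_like = [rng.gauss(60000, 8000) for _ in range(500)]
right_skewed = [rng.lognormvariate(11, 1.0) for _ in range(500)]

r_gaussian = straightness(normal_probability_points(gaussian_like))
r_skewed = straightness(normal_probability_points(right_skewed))
print(f"gaussian-like: {r_gaussian:.3f}, right-skewed: {r_skewed:.3f}")
```

The Gaussian-like sample gives points that are nearly collinear, while the right-skewed sample bends away from the line. Salary data is often right-skewed, so the second pattern is worth watching for.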
- (Topic 1)
A large car manufacturer is interested in comparing the number of sales for a specific model of electric car across all 50 US states.
The data analytics team sourced and acquired the data, and the business analyst created the model to compare sales across states.
In a meeting to review the results, the feedback received included several complaints concerning an inability to distinguish the number of sales per state. What model would result in such confusion?
Correct Answer:
D
A pie chart is a circular chart that shows the proportion of each category in a whole by dividing the circle into slices. A pie chart would result in confusion when comparing the number of sales for a specific model of electric car across all 50 US states, because it is difficult to compare the angles and areas of the slices, especially when there are many categories with similar values. A pie chart also does not show the absolute values of each category, unless they are labeled or annotated [1][2]. A better alternative would be a bar chart, which can show the number of sales for each state along a common axis, making it easier to compare and rank the values [3].
References: [1] Guide to Business Data Analytics, IIBA, 2020, p. 65; [2] Storytelling with Data, Cole Nussbaumer Knaflic, 2015, p. 67; [3] The Visual Display of Quantitative Information, Edward R. Tufte, 2001, p. 178.
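A quick calculation shows why 50 slices are hard to tell apart. The sketch below uses made-up sales figures in a similar range for each state and converts them to pie-slice angles. The figures and state labels are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the made-up figures are repeatable
# Hypothetical unit sales for 50 states, all in a similar range.
sales = {f"State {i:02d}": rng.randint(900, 1100) for i in range(1, 51)}

total = sum(sales.values())
# Each slice's angle is its share of the total, scaled to 360 degrees.
angles = {state: 360 * units / total for state, units in sales.items()}

largest = max(angles.values())
smallest = min(angles.values())
print(f"slice angles range from {smallest:.2f} to {largest:.2f} degrees")
```

Every slice is only a few degrees wide, and the largest and smallest differ by less than two degrees. Differences that small are very hard to judge by eye, whereas bar lengths on a common axis show them directly.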
- (Topic 2)
An operations manager for a new hotel needs to determine the optimum number of vans to purchase to shuttle guests to and from the airport. It will be necessary to determine the most efficient routes and schedule to follow to ensure guests do not experience excessive delays. Which business analytics technique would lend itself to supporting these types of business decisions?
Correct Answer:
A
Linear programming is a business analytics technique that lends itself to supporting these types of business decisions. It is a mathematical method that optimizes the allocation of limited resources to achieve a desired objective, subject to a set of constraints [1]. Linear programming can help the operations manager determine the optimum number of vans to purchase, the most efficient routes and schedule to follow, and the minimum cost or time to shuttle guests to and from the airport, by formulating a linear objective function and a system of linear inequalities that represent the relevant variables, parameters, and restrictions [2].
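A very small linear program shows the shape of such a formulation. The sketch below chooses how many small vans (x) and large vans (y) to run so that hourly cost is minimized while seat capacity, driver, and parking limits are met. All numbers are hypothetical. Because the problem has only two variables, it is solved by checking the corner points of the feasible region rather than with a solver library; routing and scheduling are outside the scope of this sketch.

```python
from itertools import combinations

# Each constraint is (a, b, rhs), meaning a*x + b*y <= rhs.
# x = number of small vans, y = number of large vans (hypothetical figures).
constraints = [
    (-8, -12, -60),  # seats per hour: 8x + 12y >= 60
    (1, 1, 8),       # drivers available: x + y <= 8
    (0, 1, 3),       # parking for large vans: y <= 3
    (-1, 0, 0),      # x >= 0
    (0, -1, 0),      # y >= 0
]
cost = (30, 40)  # hourly cost of one small van and one large van


def feasible(x, y, tol=1e-9):
    """Check a point against every constraint, allowing for rounding error."""
    return all(a * x + b * y <= rhs + tol for a, b, rhs in constraints)


def corner_points():
    """Yield feasible intersections of pairs of constraint boundaries."""
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect
        x = (r1 * b2 - r2 * b1) / det  # Cramer's rule
        y = (a1 * r2 - a2 * r1) / det
        if feasible(x, y):
            yield (x, y)


def total_cost(point):
    return cost[0] * point[0] + cost[1] * point[1]


# A linear objective over a bounded feasible region is minimized at a corner point.
best = min(corner_points(), key=total_cost)
best_cost = total_cost(best)
print(best, best_cost)
```

Van counts must be whole numbers in practice. The optimum here happens to be integral, but in general a fractional answer would call for integer programming or careful rounding.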
The other options are not suitable business analytics techniques for these types of business decisions. Factor analysis is a statistical method that reduces the dimensionality of a large set of correlated variables into a smaller set of uncorrelated factors that explain the underlying structure or patterns of the data [3]. Factor analysis can help the operations manager identify the key factors that influence guest satisfaction or loyalty, but it cannot help to optimize resource allocation or efficiency. Regression is a statistical method that estimates the relationship between one or more independent variables and a dependent variable. Regression can help the operations manager predict the demand or revenue of the hotel based on variables such as season, price, or location, but it cannot help to optimize resource allocation or efficiency. K-means clustering is a machine learning method that partitions a set of data points into a predefined number of clusters based on the similarity or distance between the data points. K-means clustering can help the operations manager segment the guests into different groups based on their characteristics or preferences, but it cannot help to optimize resource allocation or efficiency.
References: [1] Guide to Business Data Analytics, IIBA, 2020, p. 53; [2] Introduction to Business Data Analytics: A Practitioner View, IIBA, 2019, p. 9; [3] Guide to Business Data Analytics, IIBA, 2020, p. 54; Guide to Business Data Analytics, IIBA, 2020, p. 55.