AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Questions

Question # 4

You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?

Caffe

Keras

Microsoft Cognitive Services

TensorBoard

Question # 5

For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?

Deep neural network

K-nearest neighbors

Random forest

SVM

Question # 6

Which two of the following criteria are essential for machine learning models to achieve before deployment? (Select two.)

Complexity

Data size

Explainability

Portability

Scalability

Question # 7

When should you use semi-supervised learning? (Select two.)

A small set of labeled data is available but not representative of the entire distribution.

A small set of labeled data is biased toward one class.

Labeling data is challenging and expensive.

There is a large amount of labeled data to be used for predictions.

There is a large amount of unlabeled data to be used for predictions.

Question # 8

Which of the following is a common negative side effect of not using regularization?

Overfitting

Slow convergence time

Higher compute resources

Low test accuracy

Question # 9

Which of the following metrics is being captured when performing principal component analysis?

Kurtosis

Missingness

Skewness

Variance

Question # 10

In which of the following scenarios is lasso regression preferable over ridge regression?

The number of features is much larger than the sample size.

There are many features with no association with the dependent variable.

There is high collinearity among some of the features associated with the dependent variable.

The sample size is much larger than the number of features.

Question # 11

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

Add 1 to all of the Y values.

Divide all the Y values by the standard deviation of Y.

Explore the data for outliers.

Subtract the mean of Y from all the Y values.
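
As a side note on the scenario above, a count variable can include zeros, and the natural logarithm is undefined at zero. The following minimal sketch, using made-up counts (an assumption, not data from the question), compares the plain logarithm with log(1 + y) via Python's standard `math` module. The helper name `safe_log` is hypothetical.

```python
import math

# Hypothetical count data (illustrative only); note the zero.
counts = [0, 1, 4, 9, 99]

def safe_log(y):
    """Return log(y), or None where the logarithm is undefined (y <= 0)."""
    try:
        return math.log(y)
    except ValueError:
        return None

# Plain log fails on the zero count; log1p(y) = log(1 + y) is defined there.
plain = [safe_log(y) for y in counts]
shifted = [math.log1p(y) for y in counts]

for y, a, b in zip(counts, plain, shifted):
    print(y, a, round(b, 4))
```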

Question # 12

A company is developing a merchandise sales application. The product team uses training data to teach the AI model to predict sales and discovers emergent bias. What caused the biased results?

The AI model was trained in winter and applied in summer.

The application was migrated from on-premise to a public cloud.

The team set flawed expectations when training the model.

The training data used was inaccurate.

Question # 13

Which of the following options is a correct approach for scheduling model retraining in a weather prediction application?

As new resources become available

Once a month

When the input format changes

When the input volume changes

Question # 14

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

After logit-transformation, the data may violate the assumption of independence.

Noisy data could become more influential in your model.

The model will be more likely to violate the assumption of normality.

Values near 0.5 before logit-transformation will be near 0 after.
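
The logit definition given in the question can be evaluated with a few lines of Python. This is a minimal sketch using only the standard library; the sample proportions are illustrative assumptions chosen from the stated 0.01 to 0.99 range.

```python
import math

def logit(p):
    """logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

# Equal-sized steps in p, taken near the edges and near the middle.
for p in (0.01, 0.02, 0.49, 0.50, 0.98, 0.99):
    print(f"p={p:.2f}  logit={logit(p):+.3f}")
```

A step of 0.01 near either end of the range moves the logit much further than the same step near 0.5.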

Question # 15

Which of the following approaches is best if a limited portion of your training data is labeled?

Dimensionality reduction

Probabilistic clustering

Reinforcement learning

Semi-supervised learning

Question # 16

Which of the following describes a typical use case of video tracking?

Augmented dreaming

Medical diagnosis

Traffic monitoring

Video composition

Question # 17

Workflow design patterns for machine learning pipelines:

Aim to explain how the machine learning model works.

Represent a pipeline with directed acyclic graph (DAG).

Seek to simplify the management of machine learning features.

Separate inputs from features.

Question # 18

Which of the following items should be included in a handover to the end user to enable them to use and run a trained model on their own system? (Select three.)

Information on the folder structure in your local machine

Intermediate data files

Link to a GitHub repository of the codebase

README document

Sample input and output data files

Question # 19

Which of the following is the correct definition of the quality criterion that describes completeness?

The degree to which all required measures are known.

The degree to which a set of measures are equivalent across systems.

The degree to which a set of measures are specified using the same units of measure in all systems.

The degree to which the measures conform to defined business rules or constraints.

Question # 20

A healthcare company experiences a cyberattack in which the hackers reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

The model is overfitted and trained on a high quantity of patient records.

The model is overfitted and trained on a low quantity of patient records.

The model is underfitted and trained on a high quantity of patient records.

The model is underfitted and trained on a low quantity of patient records.

Question # 21

Which of the following principles supports building an ML system with a Privacy by Design methodology?

Avoiding mechanisms to explain and justify automated decisions.

Collecting and processing the largest amount of data possible.

Understanding, documenting, and displaying data lineage.

Utilizing quasi-identifiers and non-unique identifiers, alone or in combination.

Question # 22

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

Count Encoder

Log Encoder

Mean Encoder

Median Encoder

One-Hot Encoder
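
To make concrete what an encoder does to a categorical column, here is a minimal standard-library sketch that applies one-hot encoding and count (frequency) encoding, taken as examples, to a made-up `colors` column. The data and the helper names `one_hot` and `count_encode` are hypothetical; dedicated libraries offer more complete implementations.

```python
from collections import Counter

# Hypothetical categorical column (illustrative only).
colors = ["red", "green", "red", "blue", "red"]

def one_hot(values):
    """Map each value to a 0/1 vector with one slot per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

def count_encode(values):
    """Replace each value with the number of times it occurs in the column."""
    freq = Counter(values)
    return [freq[v] for v in values]

categories, rows = one_hot(colors)
counts = count_encode(colors)

print(categories)  # column order of the one-hot vectors
print(rows)
print(counts)
```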

Question # 23

When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?

Bag of words model with TF-IDF

Bag of bigrams (two-letter pairs)

Word2Vec algorithm

Clustering similar words and representing words by group membership

Question # 24

A data scientist is tasked to extract business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist cannot forget to include?

Cyberprotection

Cybersecurity

Data privacy

Data security

Question # 25

A change in the relationship between the target variable and input features is

concept drift.

covariate shift.

data drift.

model decay.

Question # 26

Which of the following is a type 1 error in statistical hypothesis testing?

The null hypothesis is false, but fails to be rejected.

The null hypothesis is false and is rejected.

The null hypothesis is true and fails to be rejected.

The null hypothesis is true, but is rejected.

Question # 27

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

Decision tree

Logistic regression

Random forest

XGBoost
