You want to make your model more frugal to reduce the cost of collecting and processing data.
You plan to do this by removing features that are highly correlated. You would like to create a heat
map that displays the correlation so that you can identify candidate features to remove.
Which Accelerated Data Science (ADS) SDK method is appropriate to display the comparability
between Continuous and Categorical features?
As a data scientist, you have stored sensitive data in a database. You need to protect this data by
using a master encryption algorithm, which uses symmetric keys. Which master encryption
algorithm would you choose in the Oracle Cloud Infrastructure (OCI) Vault service?
You realize that your model deployment is about to reach its utilization limit. What would you do to avoid the issue before requests start to fail?
You want to write a Python script to create a collection of different projects for your data science
team. Which Oracle Cloud Infrastructure (OCI) Data Science interface would you use?
You have created a Data Science project in a compartment called Development and shared it
with a group of collaborators. You now need to move the project to a different compartment called
Production after completing the current development iteration.
Which statement is correct?
You have created a conda environment in your notebook session. This is the first time you are
working with published conda environments. You have also created an Object Storage bucket with
permission to manage the bucket.
Which two commands are required to publish the conda environment?
You are a data scientist with a set of text and image files that need annotation, and you want to use Oracle Cloud Infrastructure (OCI) Data Labeling. Which THREE of the following annotation classes are supported by the tool?
Six months ago, you created and deployed a model that predicts customer churn for a call center. Initially, it was yielding quality predictions. However, over the last two months, users have been questioning the credibility of the predictions. Which TWO methods would you employ to verify the accuracy of the model?
You are a data scientist working inside a notebook session and you attempt to pip install a
package from a public repository that is not included in your conda environment. After running this
command, you get a network timeout error.
What might be missing from your networking configuration?
As a data scientist, you create models for cancer prediction based on mammographic images.
Correct identification is crucial in this case. After evaluating two models, you arrive at the
following results:
• Model 1 has a test accuracy of 80% and a recall of 70%.
• Model 2 has a test accuracy of 75% and a recall of 85%.
Which model would you prefer and why?
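To make the trade-off between accuracy and recall concrete, here is a minimal sketch of how both metrics follow from confusion-matrix counts. The counts are hypothetical: the question does not give the underlying matrices, so these were chosen only to reproduce the percentages quoted above on an assumed test set of 100 images with 20 actual cancer cases.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Fraction of actual positive cases that the model detects."""
    return tp / (tp + fn)


# Hypothetical confusion-matrix counts (assumed, not from the question).
models = {
    "Model 1": {"tp": 14, "fn": 6, "fp": 14, "tn": 66},
    "Model 2": {"tp": 17, "fn": 3, "fp": 22, "tn": 58},
}

for name, m in models.items():
    acc = accuracy(m["tp"], m["fp"], m["fn"], m["tn"])
    rec = recall(m["tp"], m["fn"])
    # False negatives are cancer cases the model failed to flag.
    print(f"{name}: accuracy={acc:.2f}, recall={rec:.2f}, missed cases={m['fn']}")
```

Under these assumed counts, the model with lower accuracy misses fewer actual cancer cases, which is the quantity that recall measures.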
You are preparing a configuration object necessary to create a Data Flow application. Which THREE parameter values should you provide?
During a job run, you receive an error message that no space is left on your disk device. To solve the problem, you must increase the size of the job storage. What would be the most efficient way to do this with Data Science Jobs?
You have built a machine learning model to predict whether a bank customer is going to default on a
loan. You want to use Local Interpretable Model-Agnostic Explanations (LIME) to understand a
specific prediction. What is the key idea behind LIME?
You are a data scientist leveraging the Oracle Cloud Infrastructure (OCI) Language AI service for
various types of text analyses. Which TWO capabilities can you utilize with this tool?
You are asked to prepare data for a custom-built model that requires transcribing Spanish video
recordings into a readable text format with profane words identified.
Which Oracle Cloud service would you use?
You have just received a new data set from a colleague. You want to quickly find out summary information about the data set, such as the types of features, total number of observations, and data distributions. Which Accelerated Data Science (ADS) SDK method from the ADSDataset class would you use?
You have an embarrassingly parallel or distributed batch job on a large amount of data that you
are considering running with Data Science Jobs. What would be the best approach to run the workload?
You are a data scientist building a pipeline in the Oracle Cloud Infrastructure (OCI) Data Science
service for your machine learning project. You want to optimize the pipeline completion time by
running some steps in parallel. Which statement is true about running pipeline steps in parallel?
You train a model to predict housing prices for your city. Which two metrics from the
Accelerated Data Science (ADS) ADSEvaluator class can you use to evaluate the regression model?
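As background for regression evaluation in general, the sketch below computes two standard regression metrics, mean squared error and the coefficient of determination (R²), by hand in plain Python. It does not use ADS, and it makes no claim about which metrics ADSEvaluator exposes; the function names and the housing prices are hypothetical and chosen only for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation around the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical housing prices in thousands (assumed data, not from the question).
actual = [200, 300, 400, 500]
predicted = [210, 290, 410, 490]

print("MSE:", mean_squared_error(actual, predicted))
print("R^2:", r2_score(actual, predicted))
```

Mean squared error is expressed in squared units of the target, while R² is unitless, so the two are often reported together.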
Which TWO of the following non-open-source JupyterLab extensions has Oracle Cloud Infrastructure (OCI) Data Science developed and added to the notebook session experience?