Linear Algebra

Q1 A model consists of one or more equations. The quantities appearing in the model are classified into ___

A1 variables and parameters

Q2 Spearman is a ____ test as it is computed from the order of the data regardless of the actual values

A2 nonparametric

Q3 Pearson is a ___ test as it is computed directly from the data and can be used to derive a mathematical relationship

A3 parametric
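
The contrast between the two coefficients can be sketched in plain Python. This is a minimal illustration, not a statistics library: the helper names `pearson`, `ranks`, and `spearman` and the sample data are assumptions made for this example, and the rank helper assumes there are no tied values.

```python
import math

def pearson(x, y):
    """Pearson correlation, computed directly from the data values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks, not the values."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]        # monotonic in x, but not linear

r_pearson = pearson(x, y)    # below 1: the points do not lie on a straight line
r_spearman = spearman(x, y)  # 1 up to rounding: the orderings agree perfectly
```

Because Spearman only sees the ordering, any strictly increasing transformation of y leaves it unchanged, while Pearson would change.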

Python

Q1 ____ are Python’s abstraction for data

A1 objects

Q2 All data in a Python program is represented ______

A2 by objects or by relations between objects

Q3 Every object has a ____ (Python)

A3 identity, type, value

Q4 An object’s ____ never changes once it’s been created; you may think of it as the object’s address in memory (Python)

A4 identity
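
A short sketch of identity, type, and value using the built-ins `id()`, `type()`, `is`, and `==`. The variable names are chosen only for this example.

```python
nums = [1, 2, 3]
identity_before = id(nums)   # identity: fixed for the lifetime of the object
nums.append(4)               # the value of a mutable object can change...
identity_after = id(nums)    # ...but its identity stays the same

alias = nums                 # a second name bound to the same object
copy = list(nums)            # a new object with an equal value

same_object = alias is nums          # `is` compares identities
equal_value = copy == nums           # `==` compares values
different_object = copy is not nums  # equal value, different identity
kind = type(nums)                    # the object's type: list
```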

 

Hadoop

Q1 ___ provides tools and APIs to provision, manage, and monitor Hadoop clusters

A1 Apache Ambari

Q2 ____ is a platform for doing data science and data engineering at high scale in the cloud

A2 Mortar

Q3 A tuple is made of ___ which sometimes might be called columns (Pig)

A3 fields

Q4 After performing a ___ operation, the new alias will have only two fields: one that corresponds to what was being grouped on and one field that contains all the data rows for any single group value (Pig)

A4 GROUP

Q5 Each distinct group value has its own list of corresponding data. In Pig, this is called a ____

A5 DataBag
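
The shape of a grouped alias can be imitated in Python. This is only an analogy to what the cards describe, not Pig itself: `group_by` is a hypothetical helper and the rows are made-up sample data.

```python
from collections import defaultdict

# Hypothetical rows with two fields (name, dept); not from any real dataset.
rows = [("ann", "eng"), ("bob", "ops"), ("cat", "eng")]

def group_by(records, key_index):
    """Return (group, bag) pairs, mimicking the two fields a grouping produces."""
    bags = defaultdict(list)
    for record in records:
        bags[record[key_index]].append(record)
    return sorted(bags.items())

grouped = group_by(rows, key_index=1)
# Each element has exactly two fields: the group value, and the bag
# holding every full row that shares that group value.
```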

Q6 Command to output the schema of the alias (Pig)

A6 DESCRIBE alias

Q7 The authors of Hadoop addressed the need to grow Hadoop beyond MapReduce architecturally by decoupling the ___ features built in to MapReduce from the programming model of MapReduce

A7 resource management

Q8 Under YARN, ____ is one type of available application running in a YARN container

A8 MapReduce

Q9 First-In First-Out, Fair Share, Time Based, preemption are YARN ___ options

A9 job scheduling policy

 

Linear Algebra

Q1 The ___ of a square matrix are the non-zero vectors that, after being multiplied by the matrix, remain parallel to the original vector

A1 eigenvectors

Q2 The eigenvectors of a _____ are the non-zero vectors that, after being multiplied by the matrix, remain parallel to the original vector

A2 square matrix

Q3 The eigenvectors of a square matrix are the non-zero vectors that, after being multiplied by the matrix, remain ____

A3 parallel to the original vector

Q4 For each eigenvector, the corresponding ____ is the factor by which the eigenvector is scaled when multiplied by the matrix

A4 eigenvalue

Q5 For each eigenvector, the corresponding eigenvalue is the factor by which the eigenvector is scaled when ____

A5 multiplied by the matrix

Q6 If A is a square matrix, a non-zero vector v is an eigenvector of A if there is a scalar λ such that ___

A6 Av = λv

Q7 If A is a square matrix, a non-zero vector v is an ____ of A if there is a scalar λ such that Av = λv

A7 eigenvector

Q8 If Av = λv, the scalar λ is said to be the eigenvalue of ____

A8 A corresponding to v

Q9 An ____ of A is the set of all eigenvectors with the same eigenvalue together with the zero vector

A9 eigenspace

Q10 An eigenspace of A is the ____ together with the zero vector

A10 Set of all eigenvectors with the same eigenvalue

Q11 The prefix eigen is adopted from the German word “eigen” for ___, in the sense of a characteristic description

A11 own

Q12 Matrix A acts by stretching vector X, not ___ so X is an eigenvector of A

A12 changing its direction
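
The definition Av = λv can be checked by hand on a small matrix. The sketch below uses plain Python lists; the matrix, the vectors, and the helper names `matvec` and `is_eigenvector` are assumptions chosen for illustration.

```python
def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenvector(A, v, lam):
    """True when A v equals lam * v exactly (integer arithmetic only)."""
    return matvec(A, v) == [lam * x for x in v]

A = [[2, 1],
     [1, 2]]

stretched = matvec(A, [1, 1])    # [3, 3]: same direction, scaled by 3
unchanged = matvec(A, [1, -1])   # [1, -1]: same direction, scaled by 1
rotated = matvec(A, [1, 0])      # [2, 1]: direction changed, not an eigenvector
```

Here [1, 1] and [1, -1] are eigenvectors of A with eigenvalues 3 and 1, while [1, 0] is not an eigenvector because the product is no longer parallel to it.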

Apache

Q1 What are the two different sets of environment variables referred to in Apache docs?

A1 Those controlled by the OS, those controlled by Apache server

 

NumPy

Q1 Familiar mathematical functions such as sin, cos, and exp are called ____ within NumPy

A1 “universal functions” (ufunc)

Q2 In NumPy, universal functions operate ____ on an array, producing an array as output

A2 elementwise

Q3 In NumPy, universal functions operate elementwise on an array, producing ___

A3 an array as output

Q4 Find square root of array B (NumPy)

A4 sqrt(B)

Q5 What does sqrt(B) produce when B is an array? (NumPy)

A5 an array in which each element is the square root of the corresponding element of B

Q6 How does add(B, C) combine two arrays? (NumPy)

A6 elementwise: B[0] + C[0], B[1] + C[1], and so on
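
A minimal sketch of the two universal functions on small arrays, assuming NumPy is installed and imported as `np`. The array contents are made up for the example.

```python
import numpy as np

B = np.array([1.0, 4.0, 9.0])
C = np.array([10.0, 20.0, 30.0])

roots = np.sqrt(B)    # elementwise square root of B
sums = np.add(B, C)   # elementwise sum, the same result as B + C
```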

Q7 ___ allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape (NumPy)

A7 broadcasting

Q8 broadcasting allows ____ to deal in a meaningful way with inputs that do not have exactly the same shape (NumPy)

A8 universal functions

Q9 broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same ___ (NumPy)

A9 shape

Q10 The first rule of ___ is that if all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions (NumPy)

A10 broadcasting
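
The first rule can be seen with a two-dimensional and a one-dimensional array. This is a small sketch assuming NumPy is available as `np`; the shapes and values are chosen only for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
b = np.array([10, 20, 30])       # shape (3,)

# b has fewer dimensions, so a 1 is prepended and its shape is treated
# as (1, 3); that length-1 axis is then stretched to match a's first axis.
total = a + b
```

The same row of b is added to each row of a, so `total` has shape (2, 3).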

Q11 When the indexed array a is multidimensional, a single array of indices refers to the ___ of a

A11 first dimension

Q12 When the indexed array a is multidimensional, a ____ refers to the first dimension of a

A12 single array of indices
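
A brief sketch of indexing a two-dimensional array with a single array of indices, assuming NumPy is available as `np`. The names `idx` and `picked` are just for this example.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)   # 4 rows, 3 columns
idx = np.array([0, 2])            # a single array of indices

picked = a[idx]                   # selects along the first dimension: rows 0 and 2
```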

Q13 A ___ is a vector of non-negative integers (R)

A13 dimension vector

Q14 A dimension vector is a vector of ____

A14 non-negative integers

Q15 If the ___ length is K, then the array is K dimensional (R)

A15 dimension vector

Q16 A vector can be used by R as an array only if it has a dimension vector as its ____ attribute

A16 dim

Q17 > dim(z) <- c(3,5,100) gives z the dim attribute that allows it to be treated as a ____ (R)

A17 3 by 5 by 100 array

Q18 for any array, say z, the dimension vector may be referenced explicitly as ____ (R)

A18 dim(z)

 

 

Linear Algebra

Q1 Variation within a set indicates a higher ____ while a set with all identical members has very low or zero ____

A1 entropy

Q2 In a decision tree we want entropy to ____ as we keep splitting

A2 decrease

Q3 When a decision tree is reversed, the increase in entropy as groups are combined is called ____

A3 information gain
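
Entropy and information gain can be computed directly for small label sets. The sketch below assumes Shannon entropy in bits and made-up labels; `entropy` and `information_gain` are names chosen for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

mixed = ["a", "a", "b", "b"]   # varied members: higher entropy
pure = ["a", "a", "a", "a"]    # identical members: zero entropy

gain = information_gain(mixed, [["a", "a"], ["b", "b"]])
```

Splitting `mixed` into two pure groups removes all of its entropy, so the gain equals the entropy of the parent set.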

Q4 When modelling, data sets are split into three sets: ____

A4 the training set, the validation set, the test set

Q5 If our model does poorly on the test set, why can’t we adjust it?

A5 That would be overfitting to the test set

Q6 What do we do if a model performs poorly on a test set?

A6 Start over creating a new training, cross validation and test set
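
One way to produce the three sets is to shuffle once and cut the result into pieces. This is a sketch under assumptions: the 60/20/20 proportions, the fixed seed, and the helper name `three_way_split` are illustrative choices, not something the cards prescribe.

```python
import random

def three_way_split(items, train=0.6, validation=0.2, seed=0):
    """Shuffle a copy of items and cut it into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train_set, validation_set, test_set = three_way_split(range(10))
```

The model is fitted on the training set and tuned against the validation set; the test set is touched only once, at the end, which is why tuning against it would count as overfitting to it.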

Q7

a11x1 + a12x2 + … + a1nxn = b1

a21x1 + a22x2 + … + a2nxn = b2

…     …     …    …

am1x1 + am2x2 + … + amnxn = bm

x1, x2, …, xn are the ____

A7 unknowns

Q8 a11, a12, … amn are the ____

A8 coefficients

Q9 b1, b2, …, bm are the ____

A9 constants

Q10 In a linear system each unknown is a weight for a column vector in a ___

A10 linear combination

Q11 In a linear system each unknown is a weight for a ___ in a linear combination

A11 column vector

Q12 In a linear system each ___ is a weight for a column vector in a linear combination

A12 unknown

Q13 In a linear system, the collection of all possible linear combinations of the vectors on the left-hand side is called their ____, and the equations have a solution just when the right-hand vector is within that ___. If every vector within that span has exactly one expression as a linear combination of the given left-hand vectors, then any solution is unique.

A13 span

Q14 In a linear system, the collection of all possible linear combinations of the vectors on the left-hand side is called their span, and the equations have a solution just when the right-hand vector is within that span. If every vector within that span has exactly one expression as a linear combination of the given left-hand vectors, then any solution is ____.

A14 unique

Q15 In a linear system, the collection of all possible linear combinations of the vectors on the left-hand side is called their span, and the equations have a solution just when the right-hand vector is within that span. If every vector within that span has exactly ____ as a linear combination of the given left-hand vectors, then any solution is unique.

A15 one expression

Q16 The number of vectors in a basis (its dimension) cannot be ____ than m or n but it can be smaller

A16 larger

Q17 If we have m independent vectors, a solution is ____ regardless of the right hand side, otherwise not

A17 guaranteed

Q18 In an underdetermined system, the dimension of the solution set is usually equal to ___

A18 n – m where n is the number of variables and m is the number of equations

Q19 When a linear system is inconsistent it is possible to derive a ___ from the equations that may always be rewritten as 0 = 1

A19 contradiction

Q20 When a linear system is inconsistent it is possible to derive a contradiction from the equations that may always be rewritten as ____

A20 0 = 1

Q21 A system of equations whose left hand sides are linearly independent is always ___

A21 consistent

Q22 A system of equations whose left hand sides are ____ is always consistent

A22 linearly independent

Q23 A system of equations whose ____ are linearly independent is always consistent.

A23 left-hand sides

Q24 ____ is an explicit formula for the solution of a system of linear equations, with each variable given by a quotient of two determinants

A24 Cramer’s Rule

Q25 Cramer’s Rule is an explicit formula for the solution of a system of linear equations, with each variable given by a quotient of two ____

A25 determinants

Q26 Cramer’s Rule is an explicit formula for the solution of a system of linear equations, with each variable given by a ____ of two determinants

A26 quotient
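
For a system of two equations in two unknowns, the quotient of determinants can be written out directly. The sketch below is limited to the 2 by 2 case; the helper names `det2` and `cramer_2x2` and the example system are assumptions made for illustration.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def cramer_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x1 + a12*x2 = b1 and a21*x1 + a22*x2 = b2 by Cramer's Rule."""
    d = det2(a11, a12, a21, a22)
    if d == 0:
        raise ValueError("coefficient determinant is zero; the rule does not apply")
    # Replace one column of coefficients with the constants, then divide.
    x1 = det2(b1, a12, b2, a22) / d
    x2 = det2(a11, b1, a21, b2) / d
    return x1, x2

# Example system: 2*x1 + x2 = 5 and x1 + 3*x2 = 10
solution = cramer_2x2(2, 1, 1, 3, 5, 10)
```

For this system the coefficient determinant is 5, and the two numerator determinants are 5 and 15, giving x1 = 1 and x2 = 3.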

Q27 ndarray.dtype – what’s this show? (NumPy)

A27 the data type of the elements in the array

Q28 b = array([(1.5,2,3), (4,5,6)]) – how many dimensions does b have? (NumPy)

A28 two

Q29 The type of the array can also be explicitly specified ____

A29 at creation time

Q30 Often the elements of an array are originally unknown but its size is known. Hence NumPy offers several functions to create arrays with initial ____ content

A30 placeholder

Q31 The function zeros creates ____? (NumPy)

A31 an array full of zeroes

Q32 The function ____ creates an array whose initial content is random and depends on the state of memory

A32 empty
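
The array-creation functions mentioned in the last few cards can be tried together. A small sketch assuming NumPy is available as `np`; the shapes and the variable names `c`, `z`, and `e` are arbitrary.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)])         # dtype inferred from the data
c = np.array([[1, 2], [3, 4]], dtype=complex)  # dtype specified at creation time
z = np.zeros((3, 4))                           # placeholder content: all zeros
e = np.empty((2, 3))                           # uninitialized: contents are arbitrary

element_type = b.dtype   # the data type of b's elements
dimensions = b.ndim      # number of dimensions of b
```

Because `empty` does not initialize memory, its contents should be overwritten before being read.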

Q33 Gauss completed Disquisitiones Arithmeticae, his magnum opus, in 1798 at the age of 21. This work was fundamental in consolidating ____ as a discipline

A33 number theory

Q34 If you start typing in the console or editor and hit the ___ key, RStudio will suggest functions or file names

A34 tab

Q35 Start typing in the RStudio interactive console (not the code editor window) and hit ____ to show a list of every command you’ve typed starting with those keys

A35 ctrl + up

Q36 Press ____ in RStudio to execute the current line of code

A36 ctrl + enter

Q37 Set the working directory (R)

A37 setwd() – dir path in (), replace \ with /

Q38 command for installing R package

A38 install.packages("thepackagename")

Q39 See installed R packages

A39 installed.packages()

Q40 to use an R package in your work once it’s installed, load it with ____

A40 library("thepackagename")

Q41 Update R packages

A41 update.packages()

Q42 help with a function in R

A42 ?functionName

Q43 Show an example of a function in use (R)

A43 example(functionname)

Q44 Display a list of function’s arguments

A44 args(functionName)

Q45 Search R help for a term

A45 ??term or help.search("term")

Q46 assignment in R can be done using the function ___

A46 assign(), e.g. assign("x", c(10.4,5.6))

Q47 What’s wrong with this? (R)

c(10.4,5.6) -> x

A47 nothing

Q48 Vectors can be used in arithmetic expressions in which case operations are performed ____

A48 element by element

Q49 If vectors in the same expression are not the same length, the value of the expression is ?

A49 A vector with the same length as the longest vector which occurs in the expression

Q50 y has 11 elements, x has 5 elements

v <- 2*x + y + 1 generates a new vector of length ___ (R)

A50 11

Q51

y has 11 elements, x has 5 elements

v <- 2*x + y + 1 generates a new vector of length 11

constructed by repeating 2*x ___ times

A51 2.2

Q52

y has 11 elements, x has 5 elements

v <- 2*x + y + 1 generates a new vector of length 11

constructed by repeating y ___ times

A52 1

Q53 y has 11 elements, x has 5 elements

v <- 2*x + y + 1 generates a new vector of length 11

constructed of 1 repeated __ times

A53 11
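
The recycling behaviour behind these three cards can be imitated in Python, which has no built-in equivalent. This is an analogy only: `recycle` is a hypothetical helper and the contents of x and y are made up; only their lengths (5 and 11) come from the cards.

```python
from itertools import cycle, islice

def recycle(values, length):
    """Repeat values from the start until the result has the given length."""
    return list(islice(cycle(values), length))

x = [1, 2, 3, 4, 5]    # 5 elements (hypothetical values)
y = list(range(11))    # 11 elements (hypothetical values)
n = max(len(x), len(y))

# x is stretched to length 11, i.e. used 11 / 5 = 2.2 times;
# y is used once, and the constant 1 is used 11 times.
x_recycled = recycle(x, n)
v = [2 * xi + yi + 1 for xi, yi in zip(x_recycled, recycle(y, n))]
```

R itself performs this stretching automatically and warns when the longer length is not a multiple of the shorter one, as is the case here.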

Q54 sort(x) – returns what? (R)

A54 a vector of the same size as x with the elements arranged in increasing order

Q55 1:30 creates ? (R)

A55 c(1,2,…,29,30)

Q56 30:1 creates ? (R)

A56 c(30,29,…,2,1)

Q57 A vector can be used by R as an array only if it has a dimension vector as its ___ attribute

A57 dim

Q58 if the dimension vector for an array a is c(3,4,2) then there are ___ entries in a? (R)

A58 3*4*2 = 24

 

Linux

Q1 The variable ___ contains the status of the last executed command

A1 $?

Q2 The variable $? contains ____?

A2 the status of the last executed command
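
The same exit status that the shell exposes in $? is available to a Python program through the standard `subprocess` module. The sketch below is an analogue rather than shell code; it launches the current Python interpreter as the child process so that it does not depend on any particular command being installed.

```python
import subprocess
import sys

# A child process that exits normally.
ok = subprocess.run([sys.executable, "-c", "pass"])

# A child process that exits with status 3.
failed = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

ok_status = ok.returncode          # 0 by convention means success
failed_status = failed.returncode  # non-zero signals an error
```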

Q3 With a ___ option, ls gives you a long listing, that includes the owner and size and date of the file and the permissions people have for reading (or changing) the file

A3 -l

Q4 All files given as parameters are concatenated and sent to standard output

A4 cat

Q5 deletes the file given as a parameter

A5 rm

Q6 finds occurrences of a string in one or more files

A6 grep

Q7 prints the current directory

A7 pwd

Q8 find files with given name or other properties

A8 find

Q9 tells you how much disk space is free

A9 df

Q10 shows you which processes are active and what numbers these processes have

A10 ps

Linear Algebra

Q1 ___ can be understood as a 1 dimensional indexed array (Pandas)

A1 Series

Q2 The essential difference between an Excel workbook and a Pandas dataframe is that column names and row names are known as column and row ____

A2 index

Q3 Series and dataframes form the core data model for ___

A3 Pandas
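
A minimal sketch of the two core structures and their indexes, assuming pandas is installed and imported as `pd`. The labels and values are made up for the example.

```python
import pandas as pd

# A Series: one-dimensional data with an index of labels.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A DataFrame: columns of data sharing a row index.
df = pd.DataFrame({"x": [1, 2], "y": [3.0, 4.0]}, index=["r1", "r2"])

by_label = s["b"]               # look up a Series value by its index label
cell = df.loc["r2", "y"]        # look up a cell by row label and column label
column_index = list(df.columns)
row_index = list(df.index)
```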

 
