Data Analysis

[Complete] Data Analysis : Sheet1


	Business Intelligence	It offers a way to examine trends from collected data and derive insights from it.
	48th	On an examination given to 1000 students, Jef’s score of 80 was higher than the score of 480 students who took the exam. What is the percentile for Jef’s score?
	Correlation	It refers to the degree of relationship between two variables.
	likelihood	To estimate the parameters of the model, the ________ function is maximized.
	Firth	He proposed the use of a penalized likelihood function.
	text mining	It expands available data enormously.
	roles	Which is NOT a KR technology?
	Standard	The normal distribution with a mean of 0 and standard deviation of 1.
	network topology	A network purporting to describe family memberships.
	null	Another term for an empty set.
	graphs	The following are elements in an analytic plan EXCEPT
	Medium for pragmatically diligent interpretation	The following are distinct roles that KR plays EXCEPT
	histogram	A graph that is used to indicate frequency distribution.
	MYCIN	It sees the medical world as made of empirical associations connecting symptoms to diseases.
	hidden	The constant multiplicative factors by which algorithms are related are _______ constants.
	One	The integral of a probability density function over all values of the random variable is equal to ______.
	logistic regression	It refers to a frequently used method as it enables binary or polytomous variables to be modelled.
	Regression	The equation of the _______ line predicts the value of Y given X.
	λ	The symbol used to indicate strings with no elements.
	invertible	Matrix B is
	Cluster analysis	_____________ includes identifying groups of data records.
	relative frequency distribution	It lists the percent of data in a distribution.
	multinomial logit model	A model that corresponds to the case where the dependent variable has more than two categories.
	SPSS	The following are software tools used in data mining EXCEPT
	rule-based	It views the world in terms of attribute-object-value triples.
	chi-square	The following are discrete distributions EXCEPT
	Data mining	It is used to discover patterns in large data sets
	space complexity	It relates the length of an algorithm’s input to the number of storage locations it uses.
	Poisson probability distribution	It is often used as a model of the number of arrivals at a facility in a given period of time.
	9.38	In the equation of the regression line represented by Y= 1.24 X + 6.9 if X=2 then Y =?
	{I,M,S,P}	Which of the following is a set equal to the distinct letters of the word "MISSISSIPPI"?
	Schwarz’s Bayesian Criterion	SBC means _________
	Probability density	Which function provides the value of a function at any particular value of x but does NOT directly give the probability of the random variable?
	95	What percent of data will lie within 2 standard deviations of the mean?
	DJ Patil	He coined the term "data scientist"
	{3,5,6,10,12}	If R = {(3,3), (3,6), (5,5), (5,10), (6,12)} is a binary relation, its range is
	joint	Given the sets A = {x | x is a distinct letter in the word "MATHEMATICS"} and B = {x | x is a distinct letter in the word "STATISTICS"}, the two sets are
	run time analysis	It is a theoretical classification that estimates and anticipates the increase in running time for algorithms.
	Intelligent Reasoning	It is a variety of formal calculation, typically deduction.
	KNIME	It is a powerful tool that shows the network of data.
	manipulate data efficiently and effectively	What is the focus of ?
	95	What is the value of the mean if a score of 110 is 3 standard deviations above the mean?
	1	A perfect positive correlation coefficient is equal to
	Business Intelligence	It transforms data into actionable intelligence for business purposes.
	R-programming	It is a free software programming language.
	12.25	If the standard deviation of a distribution is 3.5, the variance is
	Java	What programming language is used in RapidMiner?
	Expected	The _______ value is the weighted average of the values the random variable may assume.
	x increases, y decreases	A negative correlation exists when ___________.
	Google Maps	Example of a data product.
	i and iv	Which pair belongs to the same family of models called GLM? i) logistic ii) linear regression iii) multinomial regression iv) probability
	geometric	The following are continuous distributions EXCEPT
	surrogate	KR as a _________ is a substitute for the thing itself.
	Median	The score NOT easily affected by extreme values.
	normal	A bell-shaped distribution that is symmetric about a vertical line.
	critical thinking	According to Hilary Mason, which is NOT a skill that a good data scientist must cultivate?
	normal	The most commonly used continuous probability distribution.
	worst case	The function describing the performance of an algorithm is usually an upper bound determined from ______ inputs.
	logic	It involves a commitment in viewing the world in terms of individual entities and relations between them.
	analytics	Data is NOT information unless we add _________.
	philosophy	The following provided inspiration for what constitutes intelligent reasoning EXCEPT
	A	Which of the matrices is singular?
	Poisson and binomial	Two of the most widely used discrete probability distributions.
	5x8	What is the size of the product of a 5x6 and a 6x8 matrix?
	INTERNIST	It sees a set of prototypes in particular to be matched to cases at hand.
	regression	Which of the following is a predictive data mining technique?
	Turing machine	An example of an abstract computer.
	71-89	What range of values lie between 3 standard deviations above and below the mean if the mean is 80 and the standard deviation is 3?
	median	The middle-most value in a ranked list of numbers.
	normal	A bell-shaped distribution that is symmetric about a vertical line.
	Pearson r	Which of the following is used as a method for Correlation?
	multinomial logit model	A model that corresponds to the case where the dependent variable has more than two categories.
	cluster analysis	It includes identifying groups of data records
	Orange	It is a perfect software for machine learning.
	Big beta notation	The following are large inputs EXCEPT
	5	The value of X in the regression equation Y= 1.24 X + 6.9 if Y=13.1 is
	William Gibson	Who said that "The future is not google-able"?
	bivariate	Data involving two variables.
	data analysis	The process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information.
	Have the same size.	Addition and subtraction of matrices is possible only if the two or more matrices
	{3,5,6}	If R = {(3,3), (3,6), (5,5), (5,10), (6,12)} is a binary relation, its domain is
	ontological	KR is a set of __________ commitments.
	i and ii	Which pair belongs to the same family of models called GLM? i) logistic ii) linear regression iii) multinomial regression iv) probability
	Data mining	The goal is to transform raw data into understandable business information.
	Receiver Operating Characteristics	ROC means
	logit model	The most common function used to link probability to explanatory variables.
	PROBIT	The most common functions used to link probability to the explanatory variables are the LOGIT model and the ________ model.
	95	A survey of 100 consumers said that the price charged for a kilo of rice could be approximated by a normal distribution with a mean of 35 and a standard deviation of 4. How many of them lie between 27 and 43?
	data visualization	It makes complex data more understandable and usable.
	no mode	A data set in which every score occurs the same number of times is said to have
	Medium of human expression.	It is a language in which we say things about the world.
	datalogy	Earlier name for data science.
	Mode	The number that occurs most frequently is called ________.
	52nd	If there are 103 scores, the median is equal to the _____ ranked score.
	mean=50, s=5	What are the values of the mean and standard deviation in a normal probability density function?
	data scientist	He is someone who asks interesting questions on formal and informal theory.
	dispersion	Another term for variability.
	10	A score of 50 lies 2 standard deviations above a mean of 30. What is the value of the standard deviation?
	studio	It is used for prototyping in RapidMiner.
	time complexity	It relates the length of an algorithm’s input to the number of steps it takes.
	RapidMiner	_____________ is rated as the number one business analytics software.
	Normal	The most widely used continuous probability distribution.
	unstructured	What type of text are processed in Text analytics?
	84	A vegetable distributor knows that during the month of August, the weights of tomatoes are normally distributed with a mean of 0.61 lb and a standard deviation of 0.15 lb. What percent of the tomatoes weigh less than 0.71 lb?
	2x3	The product of a 2x5 and a 5x3 matrix is a ______ matrix.
	confusion matrix	The classification table that XLSTAT can display.
	range	The difference between the highest and lowest value.
	A + B = B + A	Which of the following is TRUE?
	Higher than the mean	A positive z-score means that the score is
	7	How many data mining techniques are there?
	KR	It is used to enable an entity to determine consequences by thinking rather than acting.
	text mining	Another term for text analytics.
	Mean, Median, Mode	Which of the following is TRUE when a distribution is normal?
	graph	Which is NOT a basic representation technology?
	random variable	It is a numerical function of the outcome of a statistical experiment.
	computational complexity theory	is an important part of a broader _____________.
	profile likelihood	It does NOT require the assumption that the parameters are normally distributed.
	84	A survey of 100 consumers said that the price charged for a kilo of rice could be approximated by a normal distribution with a mean of 35 and a standard deviation of 4. How many are less than 39?
	data visualization	Refers to using tools of statistics to present data visually.
	Chi-square	Which of the following is a continuous distribution?
	business intelligence	It is used in organization’s strategic and tactical business decision making.
	datafication	The quantification of data into information.
	multinomial logit model	It corresponds to the case where the dependent variable has more than 2 categories.
	classification	Which of the following data mining techniques is predictive?
	velocity	What increases data volume?
	Spearman rho	The method of correlation used for ranked scores is ________.
	-14 -2 13 18	3A + B
	KNIME	It is popular among financial data analysts.
	Knowledge Representation	What is KR?
	sequence	A special type of function where the domain is a set of consecutive integers.
	λ	Null strings are indicated by
	Run-time analysis	It is a theoretical classification that estimates and anticipates the increase in (or run-
	1.02	Which is NOT a value of r ?
	Hypergeometric	Which of the following is a discrete distribution?
	18	If α = babaa and β = a^6b^5bb, what is the length of the concatenation of the two strings?
	square	A matrix that has the same number of rows and columns is called
	number of books	Which is an example of a discrete random variable?
	inference	Any way to get new expressions from old ones.
	0.206	The area of the standard normal curve to the right of z=0.82 is _______.
	{A,C,I,S,T}	If A = {x | x is a distinct letter in the word "MATHEMATICS"} and B = {x | x is a distinct letter in the word "STATISTICS"}, then their intersection is
	probability density function	It provides the height or the value of the function at any particular value of x.
	150	A vegetable distributor knows that during the month of August, the weights of tomatoes are normally distributed with a mean of 0.61 lb and a standard deviation of 0.15 lb. How many can be expected to weigh more than 0.31 lb in a shipment of 6000 tomatoes?
	analysis of algorithms	It is a process of finding the computational complexity of algorithms.
	it adheres to the function	Which is NOT a component of KR?
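
Several of the numeric rows above (the regression line Y = 1.24 X + 6.9, the rice-price normal distribution with mean 35 and standard deviation 4, the matrix sizes, and the distinct-letter sets) can be rechecked with a few lines of Python. The following is only a rough sketch using the standard library; the helper names (predict_y, solve_x, distinct_letters, product_shape, percentile_rank) are made up for illustration and are not part of the quiz material.

```python
from statistics import NormalDist


def predict_y(x, slope=1.24, intercept=6.9):
    """Value of Y on the regression line Y = slope * X + intercept."""
    return slope * x + intercept


def solve_x(y, slope=1.24, intercept=6.9):
    """Invert the regression line to recover X from a given Y."""
    return (y - intercept) / slope


def distinct_letters(word):
    """Set of distinct letters in a word (case-insensitive)."""
    return set(word.upper())


def product_shape(a_shape, b_shape):
    """Shape of the product of an (m x n) and an (n x p) matrix."""
    if a_shape[1] != b_shape[0]:
        raise ValueError("inner dimensions must match")
    return (a_shape[0], b_shape[1])


def percentile_rank(n_below, n_total):
    """Percent of scores that fall below a given score."""
    return 100 * n_below / n_total


# Standard normal (mean 0, standard deviation 1) and the rice-price model.
STD_NORMAL = NormalDist(mu=0, sigma=1)
PRICES = NormalDist(mu=35, sigma=4)

if __name__ == "__main__":
    print("Y at X=2:", round(predict_y(2), 2))
    print("X at Y=13.1:", round(solve_x(13.1), 2))
    print("A intersect B:", sorted(distinct_letters("MATHEMATICS") & distinct_letters("STATISTICS")))
    print("5x6 times 6x8:", product_shape((5, 6), (6, 8)))
    print("Percentile for 480 of 1000:", percentile_rank(480, 1000))
    print("Area right of z=0.82:", round(1 - STD_NORMAL.cdf(0.82), 3))
    print("Consumers between 27 and 43:", round(100 * (PRICES.cdf(43) - PRICES.cdf(27))))
    print("Consumers below 39:", round(100 * PRICES.cdf(39)))
    print("Length of concatenation:", len("babaa" + "a" * 6 + "b" * 5 + "bb"))
```

The normal-distribution checks use the exact cumulative distribution rather than the 68-95-99.7 rule of thumb, so the results are rounded to whole consumers before comparing them with the sheet.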
