Econometrics multiple-choice exam (Part 2)

112 multiple-choice questions on Econometrics

A collection of 112 multiple-choice and essay questions for the Econometrics course, covering the Marketing research part, with answers and accompanying explanations (in English). The content covers 8 chapters as follows:

1. Introduction to marketing research
2. Exploratory research design
3. Conclusive research design
4. Sampling
5. Measurement and scaling
6. Questionnaire design
7. Data preparation and preliminary data analysis
8. Report preparation and presentation

Part 1 covers the first 4 chapters.

1. Introduction to marketing research: Scientific research approach and Problem definition

KTL_003_C1_1: The process of marketing involves all of the following EXCEPT:
○ Product
● Production
○ Pricing
○ Distribution
○ Promotion

KTL_003_C1_2: Problem identification research is undertaken to:
● Help identify problems that are not apparent on the surface and yet exist or may exist in the future.
○ Develop clear, concise marketing segments.
○ Help solve specific research problems.
○ Establish a procedure for development of a primary research plan.

KTL_003_C1_3: Which of the issues listed below would be addressed using problem-solving research?
○ the need to understand market potential
○ the need to understand current cultural trends
○ the need to understand changes in consumer behavior
● the need to determine where to locate retail outlets

KTL_003_C1_4: Every marketing research project is unique in its own sense.
● True
○ False

KTL_003_C1_5: Marketing managers require the information from marketing research for various reasons.
Which of the following is/are the reason(s) for the requirement of that information?
○ More and more companies are facing international competition.
○ Consumers have become very demanding and are asking for newer products and services all the time.
○ Managers are becoming distant from consumers due to layers in organizational hierarchy.
● All of the above.

KTL_003_C1_6: In contrast to marketing researchers, management decision-makers are more focused on:
○ scientific and technical analysis of emerging phenomena
● market performance
○ proactive research
○ long-term strategic investigation of marketplace

KTL_003_C1_7: A research project can involve both problem identification and problem-solving research.
● True
○ False

KTL_003_C1_8: To convert a management dilemma into a research question, what should a manager and researcher focus on?
○ The decision making environment
○ Alternative courses of action
○ Objectives of the decision makers
○ Consequences of alternative actions
○ None of the above
● All of the above

KTL_003_C1_9: Conducting marketing research guarantees success.
○ True
● False

KTL_003_C1_10: Marketing research can assist in the decision-making process.
● True
○ False

KTL_003_C1_11: Explain problem identification and problem-solving research in detail. Are these two types of research related?

Problem identification research is undertaken to identify problems that are perhaps not apparent on the surface and yet exist or are likely to exist in the future. On the other hand, problem-solving research is undertaken to arrive at a solution to an existing problem.

Problem identification research and problem-solving research complement each other because once a problem or opportunity has been identified, problem-solving research can be undertaken. Similarly, once problem-solving research has been carried out, a researcher might find new problems emerging out of the results, which may require problem identification research. A given marketing research project may combine both types of research. The example of green tea in the UK elaborated on these two aspects in the book chapter.

KTL_003_C1_12: What are the limitations of marketing research?

There are two major limitations of marketing research:
– It cannot provide decisions directly. Marketing research can assist in the decision-making process as a decision support tool but cannot be used as a decision-making tool.
– It cannot guarantee success. Marketing research is carried out mostly on a sample of respondents who at times may not represent the population at large. Marketing research, if conducted in the right manner, may assist in better decision making; however, it cannot guarantee success.

KTL_003_C1_13: Explain in detail the process of marketing research.

Most marketing research involves obtaining information from the marketplace directly or indirectly, and therefore the common ground is in the realm of method and technique.
The scientific marketing research process can therefore be defined in five stages: (1) Problem or opportunity identification; (2) Exploratory research; (3) Hypothesis development; (4) Conclusive research; and (5) Result. Many researchers also break down this process into further components, as explained in the phase-wise marketing research process section of the book chapter.

KTL_003_C1_14: When converting a management dilemma into research questions, what issues should be considered and why?

A manager faced with a dilemma is surrounded by various elements of decision making, namely: (1) the decision-making environment; (2) the objectives of the decision maker; (3) alternative courses of action; and (4) the consequences of alternative actions. If the research question is developed without keeping these four elements in mind, there is every chance that a bias will be introduced at an early stage of the research, carry through the whole process, and lead to a wrong conclusion.

2. Exploratory research design

KTL_003_C2_1: Which of these count as data?
○ The number of males and females in a group
○ The number of employees in an organization
○ A tape recorded interview
○ A poster for a brand of coffee
● All of these

KTL_003_C2_2: When the research objective of a study is to gain background information and to clarify the research problems to create hypotheses, it is generally referred to as:
● Exploratory research design
○ Descriptive research design
○ Causal research design
○ Experimental research design
○ All of the above

KTL_003_C2_3: Which of the following is TRUE?
○ Secondary data are more accurate than primary data.
● The researcher should attempt to gather secondary data before initiating a search for primary data.
○ Primary data are gathered by the researcher and secondary data by other researchers.
○ If a researcher obtains secondary data from the party who collected them, he or she is using a secondary source of secondary data.
○ They are all false.

KTL_003_C2_4: A quantitative research study aims to achieve all of the following, EXCEPT:
○ test various types of hypotheses
○ make accurate predictions about relationships between market factors and behaviour
● generate sustainable competitive advantages for an organization
○ gain meaningful insights into the relationships between variables
○ validate the existing relationships between variables

KTL_003_C2_5: Qualitative research techniques perform better on which of the following issues in comparison to quantitative research techniques?
○ Developing generalizable findings
● Gathering rich data
○ Distinguishing small differences
○ High reliability
○ High validity

KTL_003_C2_6: The optimal number of participants for a focus group is:
○ 1-2 members
○ 3-7 members
● 8-12 members
○ 12-20 members
○ 20-50 members

KTL_003_C2_7: For which of the following projects would secondary data collection likely be sufficient in arriving at a conclusion?
○ A bank wants to determine how the bank’s customers feel about the new service they have introduced.
● A fast-food franchisee wants to determine the market potential for a new type of specialty food in a certain area.
○ A department store chain wants to know whether consumers will spend more money if a coffee shop was introduced.
○ A pet food manufacturer wants to determine whether dogs will prefer a new type of dog food.
○ None of the above.

KTL_003_C2_8: The basic rule for data collection process is:
○ Always start by consulting the governmental statistics website
○ Begin with primary data, then supplement if needed with secondary data.
● Begin with secondary data, then proceed if necessary to collect primary data.
○ Always investigate external sources of secondary data first.
○ Design a field experiment to collect primary data.

KTL_003_C2_9: Which of the following are advantages of individual depth interviews?
○ They allow deeper and candid discussion.
○ They eliminate the negatives that group influences have in a focus group.
○ None of the above
● Both of the above (a and b)

KTL_003_C2_10: Which of the following is not a projective technique?
● In-depth interview
○ Pictorial construction
○ Word association tests
○ Sentence completion tests
○ Role plays

KTL_003_C2_11: Compare and contrast the exploratory, descriptive, and causal research designs.

The objective of exploratory design is to discover ideas and insights; that of descriptive design is to describe market characteristics; and that of causal design is to determine cause-and-effect relationships.

The characteristics of exploratory design include flexibility, versatility, and that it is often used as the front end of total research design. The characteristics of descriptive design include its preplanned and structured design and that it is marked by the prior formulation of specific hypotheses. The characteristics of causal design include the fact that mediating variables must be controlled for and that one or more independent variables are manipulated.

Methods using exploratory design include expert surveys, pilot surveys, secondary data (which is analyzed qualitatively), and qualitative research. Methods using descriptive design include secondary data (which is analyzed quantitatively), surveys, panels, and observational and other data. Methods using causal design include experiments.

KTL_003_C2_12: What is the major difference between qualitative and quantitative research techniques? Why are qualitative research techniques termed exploratory research by many?

One of the major aims of qualitative research is to gain preliminary insights into decision problems and opportunities. This technique of data collection focuses on collection of data from a relatively small number of respondents by asking questions and observing behaviour. In qualitative research most questions are open-ended in nature. Advantages of qualitative methods include: economic and timely data collection; rich data; accuracy of recording market behaviour; and preliminary insights. On the other hand, disadvantages of qualitative methods include: lack of generalizability, reliability and validity.

Quantitative research methods seek to quantify the data and typically apply some statistical analysis. They put heavy emphasis on using formalised standard questions and predetermined response options in questionnaires or surveys administered to a large number of respondents. Today, quantitative research is commonly associated with surveys and experiments and is still considered the mainstay of the research industry for collecting marketing data.

In recent years, qualitative research has come to refer to selected research methods used in exploratory research designs. Quantitative research techniques, on the other hand, are more directly related to descriptive and causal designs than to the exploratory design. Therefore, many people use the terms qualitative and exploratory interchangeably; however, a researcher should avoid doing so.

KTL_003_C2_13: Describe the various types of exploratory research designs.

Exploratory research design involves many qualitative data collection techniques such as in-depth interviews, focus groups and projective techniques. In-depth interviews are one-to-one interviews with respondents, while a focus group involves a group of 6 to 12 respondents in a congenial setting. The focus group is one of the most popular qualitative research techniques. Projective techniques involve various kinds of psychological testing such as pictorial construction, word association tests, sentence completion tests and role plays. They are used to understand the hidden associations in a consumer's mind. Qualitative data collection techniques provide a lot of rich information, but at the same time the data are hard to interpret and involve limitations with regard to generalizability, reliability and validity.

KTL_003_C2_14: What are the advantages of using projective techniques in comparison to focus groups and in-depth interviews?

Projective techniques have a major advantage over focus groups and depth interviews in that they may elicit responses that subjects would be unwilling or unable to give if they knew the purpose of the study. At times, in direct questioning, the respondent may intentionally or unintentionally misunderstand, misinterpret, or mislead the researcher. In these cases, projective techniques can increase the validity of responses by disguising the purpose. This is particularly true when the issues to be addressed are personal, sensitive, or subject to strong social norms.

3. Conclusive research design

KTL_003_C3_1: Which of the following methods can be used in administering survey instruments?
○ Personal interview
○ Mall intercept
○ Internet
○ Mail interview
● All of the above
○ None of the above

KTL_003_C3_2: All of the following are advantages of surveys, EXCEPT:
○ Surveys can tap into factors that are not directly observable
○ One can accommodate large sample sizes at relatively modest costs
○ Administration of surveys is relatively easy
● One can make extensive use of probing questions using a survey
○ Survey data can be used with advanced statistical analysis

KTL_003_C3_3: Most conclusive research designs involve qualitative research techniques.
○ True
● False

KTL_003_C3_4: What does CATI stand for in marketing research?
○ Computer anonymized telephone interaction
○ Computing & analysing technical information
○ Computer associated telephone interaction
● Computer assisted telephone interviewing
○ None of the above

KTL_003_C3_5: Which of the following is NOT an advantage of a self-administered survey?
○ Cost per survey
○ Respondent control
○ Interviewer-respondent bias
● Flexibility
○ Anonymity in responses

KTL_003_C3_6: What observation method is the most flexible?
● Personal observation
○ Mechanical observation
○ Audit
○ All of the above

KTL_003_C3_7: The survey method involves a structured questionnaire administered to a sample of a population and designed to elicit specific information from respondents.
● True
○ False

KTL_003_C3_8: Descriptive designs involve mostly experimentation.
○ True
● False

KTL_003_C3_9: Cross-sectional designs and longitudinal designs are at times compared with a photograph and a movie respectively.
● True
○ False

KTL_003_C3_10: Method of observation depends on:
○ Directness of approach
○ Respondent’s awareness of being observed
○ The rigour of information and structure
○ Observation recording method
● All of the above

KTL_003_C3_11: What different types of personal interviewing methods are used in marketing research?

Personal interviewing methods used in marketing research are broadly classified into in-home interviews, executive interviews, mall-intercept interviews and purchase-intercept interviews. In-home interviews are conducted in the respondent's home with a structured question-and-answer exchange between the interviewer and the respondent. As respondents are in the comfort of their own home, the likelihood of them answering the questions is higher in comparison. In the case of an executive interview, the exchange happens in the office of the business executive. These types of interviews are conducted to gather industry-related or market-related information. Mall-intercept interviews, as the name suggests, are face-to-face personal interviews which take place in a shopping mall. Mall shoppers are stopped and asked for feedback on certain issues. In the case of purchase-intercept interviews, respondents are stopped and asked for feedback on the product bought.

KTL_003_C3_12: Discuss the difference between cross-sectional and longitudinal research designs.

The cross-sectional design is the most common and most familiar way of conducting marketing research. It involves collection of information from any given sample of population elements only once. The objective of cross-sectional design many times is to establish categories such that classification in one category implies classification in one or more other categories.

A longitudinal design is much more reliable than a cross-sectional design for monitoring changes over time, because it relies less on consumers’ mental capabilities and more frequently monitors events as close to their time of occurrence as feasible. The primary objective of longitudinal design is to monitor change over a period of time. It involves a fixed sample of population elements that is measured repeatedly. The sample remains the same over a period of time, thus providing a series of pictures which, when viewed together, portray a detailed illustration of the situation and changes that are taking place over a period of time.
The major difference between cohort analysis and longitudinal design is thus the sample. While a longitudinal design adheres to a single sample, in cohort analysis the sample changes every time the research is conducted. In simple terms, in a longitudinal design the same people are studied over time and the same variables are measured.

KTL_003_C3_13: Discuss causal designs and experimentation.

Causal research is most appropriate when the research objectives include the need to understand the reasons why certain market phenomena happen as they do. To measure this however, the data must be gathered under controlled conditions – that is, holding constant, or neutralizing the effect of, all variables other than the causation variable (in the case above packaging change). After neutralizing the effects of other variables researchers manipulate the causation variable and measure the change in the effect variable (in the case above supermarket sales). Manipulation of the presumed causal variable and control of other relevant variables are distinct features of causal design.

Experimentation as a technique is generally used when conducting causal research. There are two kinds of experimentation techniques available to researchers namely (a) laboratory experiment and (b) field experiment. A laboratory experiment is one in which a researcher creates a situation with the desired conditions and then manipulates some while controlling other variables. The researcher is consequently able to observe and measure the effect of the manipulation of the independent variables on the dependent variable or variables in a situation in which the impact of other relevant factors is minimized. A field experiment on the other hand is a research study in a realistic or natural situation, although it too, involves the manipulation of one or more independent variables under as carefully controlled conditions as the situation will permit.

Data collected through experimentation can provide much stronger evidence of cause and effect than data collected through descriptive research. While experimentation is a robust technique for finding causation and assisting managers in decision making, there are several limitations associated with it. These limitations mostly concern the time involved in experimentation, costs and administration difficulties.

KTL_003_C3_14: Write a brief note on survey methods.

Survey methods tend to be the mainstay of marketing research in general. They tend to involve a structured questionnaire given to respondents and designed to elicit specific information. Respondents are asked a variety of questions regarding their feelings, motivations, behaviour, attitudes, intentions, emotions, demographics and other such variables. The questions are asked via direct face-to-face contact, post, telephone or internet. The responses are recorded in a structured, precise manner.

The survey method is popular for various reasons. One of the major reasons is that data collection is a function of correctly designing and administering the survey instrument (i.e. a questionnaire). This means that, unlike techniques based on exploratory design, survey methods rely less on the communication, moderation and interpretation skills of the researcher. Survey research allows the researcher to create information for precisely answering who, what, how, where and when questions relating to the marketplace. Furthermore, survey methods have the ability to accommodate large sample sizes and therefore increase the generalizability of results. With survey methods, the researcher can easily distinguish small differences. The researcher can also readily apply robust advanced statistical methods to the collected data. Such advantages make survey methods quite popular.

While survey methods provide several advantages, there are also several limitations. These limitations stem mostly from instrument development, respondent errors and response bias. Developing accurate survey instruments is a difficult task and at times is time consuming. Furthermore, because the measurement instrument is structured in nature, the in-depth and detailed data gathered in exploratory research cannot be collected. One of the major problems with survey methods is determining whether the respondents are responding truthfully or not. There is little crosschecking and flexibility available in comparison to exploratory designs. There is also a possibility of misinterpretation of the results and use of inappropriate statistical analysis procedures.

4. Sampling

KTL_003_C4_1: In which of the following situations does sampling play an important role?
○ In identifying, developing, and understanding new marketing concepts that need to be investigated
○ In designing questionnaires
○ In reducing the time and money it will take to conduct a survey
○ In developing scale measurements used to collect primary data
● All of the above

KTL_003_C4_2: We use sampling many times during our daily lives.
● True
○ False

KTL_003_C4_3: The studies which cover all the members of ______________ are called ‘census’.
○ Elements
● Population
○ Sample
○ Sampling frame
○ All of the above

KTL_003_C4_4: A ___________________ is a representation of the elements of the target population.
○ Population
● Sampling frame
○ Sample
○ Element
○ All of the above

KTL_003_C4_5: Non-sampling errors represent any type of bias that is attributable to mistakes in either drawing a sample or determining the sample size.
○ True
● False

KTL_003_C4_6: Which of the following is not a probability sampling technique?
○ Systematic random sampling
○ Cluster sampling
● Quota sampling
○ Stratified sampling

KTL_003_C4_7: In which sampling technique is a random number table employed?
○ Snowball sampling
● Simple random sampling
○ Systematic random sampling
○ Convenience sampling

KTL_003_C4_8: In which technique is the selection of the sample left entirely to the researcher?
● Convenience sampling
○ Simple random sampling
○ Stratified sampling
○ Cluster sampling

KTL_003_C4_9: Which nonprobability sampling technique is called the most refined nonprobability technique?
○ Convenience sampling
○ Simple random sampling
○ Judgement sampling
● Quota sampling
○ Snowball sampling

KTL_003_C4_10: In which of the sampling techniques does each sampling unit have a known, nonzero chance of selection?
● Probability sampling technique
○ Nonprobability sampling technique

KTL_003_C4_11: When determining the sample size, what qualitative and quantitative issues should be taken into consideration by the researcher?

The qualitative issues considered may include factors such as:
– Nature of research and expected outcome
– Importance of the decision to organization
– Number of variables being studied
– Sample size in similar studies
– Nature of analysis
– Resource constraints
Various quantitative measures are also considered when determining sample size, such as:
– Variability of the population characteristics (the greater the variability, the larger the sample required)
– Level of confidence desired (the higher the confidence desired, the larger the sample required)
– Degree of precision desired in estimating population characteristics (the more precise the study, the larger the sample required)
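
The three quantitative factors can be seen working together in the common textbook formula for the sample size needed to estimate a mean under simple random sampling, n = (z × σ / e)². The Python sketch below is only an illustration and is not part of the original answer: the function name, the assumption of a large population (no finite population correction) and the example figures are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev, margin_of_error, confidence=0.95):
    """Approximate sample size to estimate a mean under simple random sampling.

    Assumes a large population, so no finite population correction is applied.
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / e)^2, rounded up to a whole respondent.
    return math.ceil((z * std_dev / margin_of_error) ** 2)


# Hypothetical figures: standard deviation 15, desired precision +/- 2.
baseline = sample_size_for_mean(std_dev=15, margin_of_error=2)
more_variable = sample_size_for_mean(std_dev=30, margin_of_error=2)
more_confident = sample_size_for_mean(std_dev=15, margin_of_error=2, confidence=0.99)
more_precise = sample_size_for_mean(std_dev=15, margin_of_error=1)

print(baseline, more_variable, more_confident, more_precise)
```

Each of the three variations asks for a larger sample than the baseline, which mirrors the three rules listed in the answer.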

KTL_003_C4_12: Provide a brief note highlighting the major differences between probability and nonprobability sampling techniques.

Probability sampling is more robust in comparison as in this technique each sampling unit has a known, nonzero chance of getting selected in the final sample.

Nonprobability techniques, on the other hand, do not use a chance selection procedure. Rather, they rely on the personal judgement of the researcher. The results obtained by using probability sampling can be generalized to the target population within a specified margin of error through the use of statistical methods. Put simply, probability sampling allows researchers to judge the reliability and validity of the findings in comparison to the defined target population. In the case of nonprobability sampling, the probability of selecting each sampling unit is unknown and therefore the potential error between the sample and the target population cannot be computed. Thus, the generalizability of findings generated through nonprobability sampling is limited. While probability sampling techniques are robust in comparison, one of their major disadvantages is the difficulty of obtaining a complete, current and accurate listing of target population elements.
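
As an illustration of generalizing "within a specified margin of error", the sketch below draws a simple random sample from a made-up population and computes the usual normal-approximation margin of error for a proportion. The population, the 30% brand preference share, the seed and the function name are hypothetical and chosen only for this example.

```python
import math
import random
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical target population: 1 = prefers the brand, 0 = does not.
population = [1] * 3000 + [0] * 7000

# Simple random sampling: every unit has the same known chance (400/10000).
rng = random.Random(42)
sample = rng.sample(population, 400)

p_hat = sum(sample) / len(sample)
moe = margin_of_error(p_hat, len(sample))
print(f"Estimated share: {p_hat:.3f} +/- {moe:.3f}")
```

No comparable calculation is available for a convenience or judgement sample, because the selection probabilities that the formula relies on are unknown.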

KTL_003_C4_13: Discuss stratified sampling in detail.

Stratified sampling is a probability sampling technique which is distinguished by the two-step procedure it involves. In the first step, the population is divided into mutually exclusive and collectively exhaustive sub-populations, which are called strata. In the second step, a simple random sample of elements is chosen independently from each group or stratum. This technique is used when there is considerable diversity among the population elements. The major aim of this technique is to reduce cost without loss of precision. There are two types of stratified random sampling: (a) proportionate stratified sampling and (b) disproportionate stratified sampling. In proportionate stratified sampling, the sample size from each stratum depends on that stratum's size relative to the defined target population. Therefore, the larger strata are sampled more heavily using this method, as they make up a larger percentage of the target population. On the other hand, in disproportionate stratified sampling, the sample selected from each stratum is independent of that stratum's proportion of the total defined target population. There are several advantages of stratified sampling, including the assurance of representativeness, comparison between strata, and an understanding of each stratum as well as its unique characteristics. One of the major difficulties, however, is identifying the correct stratifying variable.
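
The two-step procedure with proportionate allocation can be sketched in Python as follows. This is a minimal illustration under assumed inputs: the strata names, their sizes, the total sample of 200 and both function names are hypothetical, and the largest-remainder rounding is just one reasonable way to make the allocations sum to the requested total.

```python
import random


def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the sample to strata in proportion to stratum size."""
    population = sum(strata_sizes.values())
    exact = {name: total_sample * size / population
             for name, size in strata_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}
    # Hand any leftover units to the strata with the largest remainders.
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


def stratified_sample(strata, allocation, seed=None):
    """Step 2: draw an independent simple random sample from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}


# Step 1: mutually exclusive, collectively exhaustive strata (hypothetical IDs).
strata = {
    "urban": list(range(0, 6000)),
    "suburban": list(range(6000, 9000)),
    "rural": list(range(9000, 10000)),
}
allocation = proportionate_allocation({k: len(v) for k, v in strata.items()}, 200)
sample = stratified_sample(strata, allocation, seed=1)
print(allocation)
```

For disproportionate stratified sampling, the `allocation` dictionary would instead be set by the researcher, for example to oversample a small but important stratum.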

KTL_003_C4_14: Explain quota sampling and its advantages as well as disadvantages.

Quota sampling restricts the selection of the sample by controlling the number of respondents according to one or more criteria. The restriction generally involves quotas regarding respondents' demographic characteristics (e.g. age, race, income), specific attitudes (e.g. satisfaction level, quality consciousness), or specific behaviours (e.g. frequency of purchase, usage patterns). These quotas are assigned in such a way that the quotas remain similar to the population with respect to the characteristics of interest. Quota sampling is also viewed as two-stage restricted judgement sampling. In the first stage, restricted categories are built as discussed above, and in the second stage respondents are selected on the basis of the convenience or judgement of the researcher. This procedure is used quite frequently in marketing research as it is easier to manage in comparison to stratified random or cluster sampling. Quota sampling is often called the most refined form of nonprobability sampling. It also reduces or eliminates the selection bias on the part of field workers which is strongly present in convenience sampling. However, being a nonprobability method, it has disadvantages in terms of representativeness and generalizability of findings to a larger population.

Part 2 covers the last 4 chapters.

5. Measurement and scaling

KTL_003_C5_1: Most people use measurement in their daily lives.
● True
○ False

KTL_003_C5_2: The idea of assigning numbers can be helpful in:
○ allowing statistical testing
○ facilitating easier communication
● Both a and b
○ None

KTL_003_C5_3: The appropriateness of the raw data being collected depends directly on the scaling technique used by the researcher.
● True
○ False

KTL_003_C5_4: Which of the following scales has the assignment property?
○ Nominal
○ Ordinal
○ Interval
○ Ratio
● All of the above
○ None of the above

KTL_003_C5_5: The interval scale possesses all of the below properties, except:
○ Assignment
○ Order
○ Distance
● Origin
○ All of the above
○ None of the above

KTL_003_C5_6: The origin property refers to a numbering system where zero is the displayed or referenced starting point in the set of possible responses.
● True
○ False

KTL_003_C5_7: Which among the following is not a comparative scaling technique?
○ Paired comparison
○ Rank order
○ Constant sum scale
○ Q-sort
● Stapel scale

KTL_003_C5_8: Which among the following is not a noncomparative scaling technique?
○ Likert
○ Stapel
○ Semantic differential
● Rank order
○ None of the above

KTL_003_C5_9: Respondent characteristics such as intelligence and education do not have any effect on the test score.
○ True
● False

KTL_003_C5_10: Validity refers to scale consistency over a period of time.
○ True
● False

KTL_003_C5_11: Write a brief note on fundamental properties of measurement.

There are four fundamental properties of measurement: assignment, order, distance and origin. The assignment property is also referred to as the description or category property. It refers to the researcher's use of unique descriptors, or labels, to identify each object within a set. The second measurement scale property, the order property, refers to the relative magnitude between the descriptors. The distance property refers to a measurement scheme in which the exact difference between each of the descriptors is expressed in absolute terms. The origin property refers to a measurement scheme in which there exists a unique starting point in a set of scale points. For the most part, the origin property refers to a numbering system where zero is the displayed or referenced starting point in the set of possible responses. Each scaling property builds on the previous one. For example, a scale which includes the order property will have the assignment property built in. Similarly, a scale which possesses the distance property will have both the assignment and order properties. A scale based on the origin property will include the assignment, order and distance properties as well.

KTL_003_C5_12: Discuss construct validity and the types of construct validity.

Construct validity addresses the question of what construct or characteristic the scale is, in fact, measuring. When assessing construct validity, the researcher attempts to answer theoretical questions about why the scale works and what deductions can be made concerning the underlying theory. Thus, construct validity requires a sound theory of the nature of the construct being measured and how it relates to other constructs. Construct validity is the most sophisticated and difficult type of validity to establish. Construct validity includes convergent, discriminant, and nomological validity.

Convergent validity is the extent to which the scale correlates positively with other measures of the same construct. It is not necessary that all these measures be obtained by using conventional scaling techniques.

Discriminant validity is the extent to which a measure does not correlate with other constructs from which it is supposed to differ. It involves demonstrating a lack of correlation among differing constructs.

Nomological validity is the extent to which the scale correlates in theoretically predicted ways with measures of different but related constructs. A theoretical model is formulated that leads to further deductions, tests, and inferences. Gradually, a nomological net is built in which several constructs are systematically interrelated.

KTL_003_C5_13: Write a brief note about comparative and non-comparative scaling.

The scaling techniques regularly employed in marketing research can be classified into two basic strands: (a) comparative scaling and (b) non-comparative scaling. As the name suggests, comparative scaling involves direct comparison of stimulus objects with one another. For example, managers are generally interested in knowing consumer preference regarding their brand in comparison to a competitor’s brand. A researcher can ask a question such as which of the two brands the consumer prefers, and this gives the manager a clear idea of consumer preferences. Several techniques are used in building comparative scales, such as paired comparison, rank order, constant sum and Q-sort.

While comparative scaling compares stimuli with one another, non-comparative scaling involves each stimulus object being scaled independently of the other objects in the stimulus set. The resulting data in a non-comparative scale are assumed to be interval or ratio scaled. For example, instead of asking for a direct comparison between brands, the researcher may ask the respondent to rate each brand separately on a scale of 1 to 10, and can then evaluate each brand individually as well as compare the brands. Non-comparative scaling techniques include continuous rating scales and itemised rating scales. Itemised rating scales are further sub-divided into the Likert scale, the semantic differential scale and the Stapel scale.

KTL_003_C5_14: What are the various measures for reliability assessment of a scale?

Reliability in research relates to the consistency of results over a period of time. A scale is called reliable if it produces consistent results when repeated measurements are made. In test-retest reliability measurement, the same respondents are administered identical sets of scale items at two different times (usually 2 to 4 weeks apart). The degree of similarity between the two measurements, assessed through the correlation between them, determines the reliability: the higher the correlation, the higher the scale reliability. In measuring alternative forms reliability, two equivalent forms of the scale are constructed and the same respondents are measured with each form at two different times. Internal consistency reliability is used to assess the reliability of a summated scale, where several items are summed to form a total score. In simple words, each item in the scale must measure part of what the scale is developed to measure. Techniques such as ‘split-half reliability’ and ‘coefficient alpha’ (also known as Cronbach’s alpha) are used to measure internal consistency reliability. In split-half reliability, the scale is split into two halves and the resulting half scores are correlated. A high correlation between the two halves indicates high internal consistency. Coefficient alpha is the average of all possible split-half coefficients. A value above 0.7 suggests acceptable internal reliability.
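
The two internal consistency measures can be illustrated with a minimal Python sketch. The data and the function names (pearson, cronbach_alpha) are hypothetical, and the alpha calculation uses the common variance-based formula rather than literally averaging every split-half coefficient.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(rows):
    """Coefficient alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)


# Hypothetical data: 5 respondents answering a 4-item, 5-point scale.
scores = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(scores)

# Split-half: correlate the sum of items 1-2 with the sum of items 3-4.
first_half = [r[0] + r[1] for r in scores]
second_half = [r[2] + r[3] for r in scores]
split_half_r = pearson(first_half, second_half)

print(f"alpha = {alpha:.3f}, split-half r = {split_half_r:.3f}")
```

The same pearson helper could be applied to two administrations of the scale to estimate test-retest reliability. With this made-up data the alpha value is well above the 0.7 threshold mentioned above.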

6. Questionnaire design

KTL_003_C6_1: A questionnaire is a formalized set of questions involving one or more measurement scales designed to collect specified secondary data.
○ True
● False

KTL_003_C6_2: The first step in developing a questionnaire is to specify the information needed in researchable format.
● True
○ False

KTL_003_C6_3: In which of the following interviewing methods can the most complex question scales be used easily?
● Personal interviews
○ Telephone interviews
○ Mail interviews
○ Online interviews

KTL_003_C6_4: Unstructured questions are also called:
○ Close ended
● Open ended
○ Both
○ None

KTL_003_C6_5: Open ended questions are mostly used in:
● Exploratory research
○ Conclusive research
○ Both
○ None

KTL_003_C6_6: What should be avoided when developing a questionnaire?
○ Complex words
○ Ambiguous words
○ Leading questions
○ Generalizations
● All of the above
○ None of the above

KTL_003_C6_7: Double barrelled questions should be avoided in questionnaire development.
● True
○ False

KTL_003_C6_8: The forward and opening questions are highly important in gaining respondents’ trust and making them feel comfortable with the study.
● True
○ False

KTL_003_C6_9: Most socioeconomic and demographic questions are defined as:
○ Identification information
○ Specific information
● Classification information
○ All of the above

KTL_003_C6_10: A questionnaire should not be used in the field survey without being adequately pilot tested.
● True
○ False

KTL_003_C6_11: When selecting the use of a neutral alternative in dichotomous questions what considerations should be kept in mind?

If a neutral alternative is not included, respondents are forced to choose between “yes” and “no” even if they feel indifferent. On the other hand, if a neutral alternative is included, respondents can avoid taking a position on the issue, thereby biasing the results. The following guidelines are offered. If a substantial proportion of the respondents can be expected to be neutral, include a neutral alternative. If the proportion of neutral respondents is expected to be small, avoid the neutral alternative.

KTL_003_C6_12: Describe the importance of pilot testing in questionnaire building.

Once the preliminary questionnaire has been developed, a researcher should test it on a small sample of respondents to identify and eliminate potential problems. This process is called pilot testing. A questionnaire should not be used in the field survey without being adequately pilot tested. A pilot test covers all aspects of a questionnaire, including content, wording, order, form and layout. The respondents selected for the pilot test must be similar to those who will be included in the actual survey in terms of their background characteristics, familiarity with the topic, and attitudes and behaviours of interest. An initial pilot test based on personal interviews is recommended for all types of surveys because the researcher can observe respondents’ attitudes and reactions towards each question. Once the necessary changes have been made following this initial pilot test, another pilot test can be conducted using the intended mode for mail, telephone or internet based surveys. Most researchers recommend a pilot test sample of between 15 and 30 respondents. If the study is very large and involves multiple stages, a larger pilot test sample may be required. Finally, the responses obtained from the pilot test sample should be coded and analysed. These responses provide a check on the adequacy of the data obtained in answering the issue at hand.

KTL_003_C6_13: What are the steps involved in questionnaire building?

While the questionnaire building process is debated, there is consensus among the research community that the design process involves some established rules of logic, objectivity and systematic procedure. The generic process of questionnaire building involves the following steps:
– Specification of the information needed in researchable format
– Selection of interview method
– Determination of question composition
– Determination of individual question content
– Developing question order, form and layout
– Pilot testing the questionnaire

KTL_003_C6_14: Describe the use of forward, generic and specific information questions in questionnaire development.

The questionnaire can be divided in three main parts generally: forward and opening questions; generic information questions; specific information questions.

The forward and opening questions are highly important in gaining respondents’ trust and making them feel comfortable with the study. They also improve the response rate if respondents find them worthwhile and interesting. Opinion questions can give a good start to most questionnaires, as everyone likes to give an opinion about the issues at hand. When it is necessary to qualify respondents (i.e. determine whether they are part of the defined target population), opening questions can act as qualification questions.

Generic information questions are divided into two main areas: classification information questions and identification information questions. Most socioeconomic and demographic questions (age, gender, income group, family size and so on) provide classification information. On the other hand, respondent name, address, and other contact information provide identification information. It is advisable to collect classification information before identification information as most respondents do not like their personal information collected by researchers and this process may alienate the respondent from the interview.

The specific information questions are directly associated with the research objectives. They mostly involve various scales and are complex in nature. These questions should be asked later in the questionnaire, after rapport has been established between the researcher and the respondent. Most researchers agree that it is good to start with forward and opening questions, follow progressively with specific information questions, and conclude with classification and identification information questions.

7. Data preparation and preliminary data analysis

KTL_003_C7_1: Most market research studies can be solved only by collecting secondary data.
○ True
● False

KTL_003_C7_2: Which of the following steps is not involved in fieldwork?
○ Selection of fieldworkers
○ Training of fieldworkers
○ Supervision of fieldworkers
○ Evaluation of fieldworkers
○ All of the above
● None of the above

KTL_003_C7_3: Probing helps in motivating the respondent and helps focus on a specific issue.
● True
○ False

KTL_003_C7_4: Which of the following is not an appropriate probing technique?
○ Repeating the question
○ Repeating the respondents’ reply
● Forcing the respondent to remember
○ Eliciting clarification
○ Using objective/neutral questions or comments

KTL_003_C7_5: One of the major editing problems concerns the faking of an interview.
● True
○ False

KTL_003_C7_6: How can a researcher avoid and cross-check for fake interviews?
○ Use complex scales
○ Use dichotomous questions
○ Use only close ended questions
● Use few open-ended questions

KTL_003_C7_7: What types of questions are relatively hard to code?
○ Multiple choice questions
○ Dichotomous questions
● Open-ended questions
○ Likert scale based questions

KTL_003_C7_8: Data cleaning involves which of the following?
○ Substituting missing value with a neutral value
○ Substituting an imputed response by following a pattern of respondent’s other responses
○ Casewise deletion
○ Pairwise deletion
● All of the above
○ None of the above

KTL_003_C7_9: Categorical variables involve which of the following scales?
● Nominal and ordinal
○ Nominal and interval
○ Nominal and ratio
○ Ordinal and ratio
○ Ordinal and interval
○ Interval and ratio

KTL_003_C7_10: Continuous variables involve which of the following scales?
○ Nominal and ordinal
○ Nominal and interval
○ Nominal and ratio
○ Ordinal and ratio
○ Ordinal and interval
● Interval and ratio

KTL_003_C7_11: Discuss data cleaning and its importance in preliminary data analysis.

Data cleaning focuses on error detection and consistency checks as well as the treatment of missing responses. The first step in the data cleaning process is to check each variable for data that are out of range or otherwise logically inconsistent. Such data must be corrected as they can hamper the overall analysis process.

Most advanced statistical packages provide output that flags such inconsistent data. Flagged data must be closely examined, as sometimes they are not actually inconsistent and represent a legitimate response.

In most surveys, some respondents provide ambiguous responses or some responses are improperly recorded. In such cases, missing value analysis is conducted to clean the data. If the proportion of missing values is more than 10%, it poses greater problems. There are four options for treating missing values: (a) substituting the missing value with a neutral value (generally the mean value for the variable); (b) substituting an imputed response based on the pattern of the respondent’s other responses; (c) casewise deletion, in which respondents with any missing responses are discarded from the analysis; and (d) pairwise deletion, in which only the respondents with complete responses for the specific variable under analysis are included. The different procedures for data cleaning may yield different results, so the researcher should take the utmost care when cleaning the data. Data cleaning should be kept to a minimum if possible.
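
Three of the four options can be sketched in a few lines of Python. The response data and function names below are hypothetical and serve only to show how each treatment changes the data that reaches the analysis; option (b), imputation from the respondent's other answers, depends on the chosen imputation model and is left out.

```python
from statistics import mean

# Hypothetical survey data: None marks a missing response.
responses = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": None, "q2": 4, "q3": 2},
    {"q1": 5, "q2": None, "q3": 4},
    {"q1": 3, "q2": 2, "q3": None},
    {"q1": 2, "q2": 3, "q3": 5},
]


def missing_share(rows):
    """Proportion of all cells that are missing."""
    cells = [v for r in rows for v in r.values()]
    return sum(v is None for v in cells) / len(cells)


def mean_substitution(rows, var):
    """Option (a): replace missing values of var with the variable mean."""
    fill = mean(r[var] for r in rows if r[var] is not None)
    return [r[var] if r[var] is not None else fill for r in rows]


def casewise_deletion(rows):
    """Option (c): keep only respondents with no missing responses at all."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise_values(rows, var):
    """Option (d): keep every respondent who answered this specific variable."""
    return [r[var] for r in rows if r[var] is not None]


share = missing_share(responses)
q1_filled = mean_substitution(responses, "q1")
complete_cases = casewise_deletion(responses)
q1_pairwise = pairwise_values(responses, "q1")

print(f"missing share: {share:.0%}")
print("q1 after mean substitution:", q1_filled)
print("respondents kept (casewise):", len(complete_cases))
print("respondents kept for q1 (pairwise):", len(q1_pairwise))
```

In this example casewise deletion keeps far fewer respondents than pairwise deletion, which illustrates why the choice of procedure can change the results.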

KTL_003_C7_12: Explain the data editing and coding process in detail.

The usual first step in data preparation is to edit the raw data collected through the questionnaire. Editing detects errors and omissions, corrects them where possible, and certifies that minimum data quality standards have been achieved. The purpose of editing is to generate data which is: accurate; consistent with intent of the question and other information in the survey; uniformly entered; complete; and arranged to simplify coding and tabulation.

Sometimes it becomes obvious that an entry in the questionnaire is incorrect or entered in the wrong place. Such errors could have occurred in interpretation or recording. When responses are inappropriate or missing, the researcher has three choices:

(a) The researcher can sometimes detect the proper answer by reviewing the other information in the schedule. This practice, however, should be limited to those few cases where it is obvious what the correct answer is.

(b) The researcher can contact the respondent for the correct information, if identification information has been collected and if time and budget allow.

(c) The researcher can strike out the answer if it is clearly inappropriate. Here an editing entry of ‘no answer’ or ‘unknown’ is called for. This procedure, however, is not very useful if the sample size is small, as striking out an answer generates a missing value and often means that the observation cannot be used in the analyses that contain this variable.

One of the major editing problems concerns the faking of an interview. Fake interviews are hard to spot until they reach the editing stage, and if the interview contains only tick boxes it becomes very difficult to spot such fraudulent data. One of the best ways to tackle fraudulent interviews is to add a few open-ended questions to the questionnaire, as these are the most difficult to fake. Distinctive response patterns in other questions will often emerge if faking is occurring. To uncover this, the editor must analyse the instruments used by each interviewer.

Coding involves assigning numbers or other symbols to answers so the responses can be grouped into a limited number of classes or categories. Specifically, coding entails the assignment of numerical values to each individual response for each question within the survey. Classifying data into limited categories sacrifices some detail but is necessary for efficient analysis. Instead of recording the word male or female in response to a question about gender, we could use the codes ‘M’ or ‘F’. Normally this variable would be coded 1 for male and 2 for female, or 0 and 1. Similarly, a Likert scale can be coded as: 1 = strongly disagree; 2 = disagree; 3 = neither agree nor disagree; 4 = agree; and 5 = strongly agree. Coding the data in this format helps the overall analysis process, as most statistical software handles numbers easily. Coding helps the researcher to reduce several thousand replies to a few categories containing the critical information needed for analysis. In coding, categories are the partitions of a set, and categorization is the process of using rules to partition a body of data.
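
As a minimal sketch of this idea, the Python snippet below applies a codebook to raw answers. The codebook names and the sample answers are hypothetical; the numeric codes follow the 1 to 5 Likert coding and the 1/2 gender coding described above.

```python
# Hypothetical codebooks following the coding schemes described in the text.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}
GENDER_CODES = {"male": 1, "female": 2}


def code_response(raw, codebook):
    """Return the numeric code for a raw answer, or None if it is not in the codebook."""
    return codebook.get(raw.strip().lower())


# Raw answers as they might be typed in, including one invalid entry.
raw_answers = ["Agree", "strongly disagree ", "Neither agree nor disagree", "maybe"]
coded = [code_response(a, LIKERT_CODES) for a in raw_answers]
gender = code_response("Female", GENDER_CODES)

print(coded)
print(gender)
```

Answers that do not match the codebook come back as None, so they surface as missing values at the editing stage instead of being silently assigned to a category.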

KTL_003_C7_13: Why should a researcher do a normality and outliers assessment before hypothesis testing?

To apply many advanced statistical techniques, researchers have to assume that the data are normally distributed (i.e. symmetrical, following a bell curve) and free of outliers. In simple terms, if the data were plotted, most data points would fall in the middle and the number of data points would fall away symmetrically on either side as we move away from the middle. Normality and outliers analysis therefore checks a fundamental assumption of many advanced statistical techniques. Skewness and kurtosis analysis can give some idea of normality. Positive skewness values suggest clustering of data points at the low values (left hand side of the bell curve) and negative skewness values suggest clustering of data points at the high values (right hand side of the bell curve).

Positive kurtosis values suggest that the data points are peaked (gathered in the centre) with long thin tails. Kurtosis values below 0 suggest that the distribution of data points is relatively flat (i.e. too many cases in the extremes). Without a normality and outliers assessment, the researcher may get false results, which might lead to wrong conclusions and decisions.
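
Skewness and kurtosis can be computed directly from the data. The sketch below uses the simple moment-based definitions with kurtosis reported as excess kurtosis (so a normal distribution scores about 0); statistical packages often apply small-sample corrections, so their values may differ slightly. The data sets are hypothetical.

```python
from statistics import mean


def skewness(data):
    """Moment-based skewness: 0 for a symmetric distribution."""
    n, m = len(data), mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3: positive is peaked, negative is flat."""
    n, m = len(data), mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3


# Hypothetical data: most values are low, with a long tail of high values.
right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]
# Hypothetical data: evenly spread and symmetric.
symmetric = [1, 2, 3, 4, 5]

print("skewness (right-skewed):", round(skewness(right_skewed), 3))
print("skewness (symmetric):", round(skewness(symmetric), 3))
print("excess kurtosis (symmetric):", round(excess_kurtosis(symmetric), 3))
```

The first data set clusters at the low values and gives positive skewness, while the evenly spread data set gives zero skewness and negative kurtosis, matching the interpretation above.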

KTL_003_C7_14: List the steps for generic hypothesis testing procedure.

Testing for statistical significance follows a relatively well-defined pattern, although authors differ in the number and sequence of steps. The generic process is described below.
1. Formulate the hypothesis
2. Select an appropriate test
3. Select desired level of significance
4. Compute the calculated difference value
5. Obtain the critical value
6. Compare the calculated and critical values
7. Marketing research interpretation
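
The seven steps can be traced in a short Python sketch. The ratings, the hypothesised mean of 6.5 and the choice of a two-sided one-sample z-test are all assumptions made for illustration; with a small sample a t-test would be the usual choice.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical data: satisfaction ratings (1-10) from 40 respondents.
ratings = [7, 8, 6, 7, 9, 5, 7, 8, 6, 7] * 4

# 1. Formulate the hypothesis: H0: mean = 6.5, H1: mean != 6.5.
mu0 = 6.5

# 2. Select an appropriate test: a two-sided one-sample z-test
#    (assumed here because the sample is reasonably large).

# 3. Select the desired level of significance.
alpha = 0.05

# 4. Compute the calculated value of the test statistic.
n = len(ratings)
z_calc = (mean(ratings) - mu0) / (stdev(ratings) / n ** 0.5)

# 5. Obtain the critical value from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# 6. Compare the calculated and critical values.
reject_h0 = abs(z_calc) > z_crit

# 7. Interpret the result for the marketing decision.
print(f"z = {z_calc:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```

Here the calculated value exceeds the critical value, so the null hypothesis is rejected at the 5% level; the marketing interpretation would be that average satisfaction differs from the assumed benchmark.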

8. Report preparation and presentation

KTL_003_C8_1: Marketing research report is the bridge between researcher and manager with regard to the research findings.
● True
○ False

KTL_003_C8_2: A project can still be called successful, even if the research results are not effectively communicated using the research report.
○ True
● False

KTL_003_C8_3: Many times managers judge the research by the quality of the report.
● True
○ False

KTL_003_C8_4: While writing the report, the researcher should empathize with how the manager will be reading and interpreting the report.
● True
○ False

KTL_003_C8_5: Which of the following must be kept in mind when writing a marketing research report?
○ Empathizing skills
○ Structure and logical arguments
○ Objectivity
○ Professional presentation
● All of the above

KTL_003_C8_6: Many consider the executive summary the soul of the research report.
● True
○ False

KTL_003_C8_7: Executive summary should involve all of the following, except:
○ Why and how the research was carried out
● What was done to manage fieldworkers
○ What was found
○ What can be interpreted and acted upon by the manager

KTL_003_C8_8: Which of the following sections in report should provide background information to the research?
○ Research methodology
○ Results
○ Conclusion
● Introduction

KTL_003_C8_9: Pilot testing should be discussed in which of the following sections of the report.
○ Introduction
● Research methodology
○ Results
○ Conclusion

KTL_003_C8_10: The researcher should succinctly explain any jargon used in the report.
● True
○ False

KTL_003_C8_11: Discuss the importance of marketing research report in the overall marketing research process.

The marketing research report is the bridge between researcher and manager with regard to the research findings. Even if the research project is carried out with the most meticulous design and methodology, the project may not be a success if the results are not effectively communicated to the manager through the research report. This is because the results will then not help in achieving the major aim of any research project, which is to support the decision making process. The research report is a tangible output of the research project: it not only helps in decision making but also provides documentary evidence and serves as a historical record of the project. Many times, managers are only involved in looking at the research report (i.e. the oral presentation and written report), and therefore the research project is most often judged by the quality of the report. This directly affects the relationship between the researcher and the manager. All of the above reasons show the importance of the marketing research report.

KTL_003_C8_12: What are the key issues to keep in mind when writing research reports?

Before communicating the results of the project to the manager, the researcher should keep several issues in mind for effective communication. The first and foremost rule for writing the report is to empathize. The researcher must keep in mind that the manager who is going to read and use the findings of the research project might not be as knowledgeable about statistical techniques or, at times, about the methodology. Furthermore, the manager will be more interested in knowing how the results can be used for decision making than in how they have been derived. Therefore, jargon and technical terms should be kept to a minimum. If jargon cannot be avoided, the researcher should provide a brief explanation so the manager can understand it.

The second rule the researcher should keep in mind relates to the structure of the report. The report should be logically structured and easy to follow. The manager should easily be able to grasp the inherent linkages and connections within the report. The write-up should be succinct and to the point. A clear and uniform pattern should be employed. One of the best ways to check whether the structure of the report is sound is to have some of the research team members review it critically.

Furthermore, the researcher must make sure that scientific rigour and objectivity are not lost when presenting the research project findings. At times, because of the researcher’s heavy involvement in the overall research process, objectivity may be lost. Therefore, the researcher should keep a check on the objectivity of the overall report. Managers often do not like to see results that oppose their own beliefs; however, the researcher must have the courage to present the findings without any slant to conform to the expectations and beliefs of the managers.

A professionally developed report is always well received, as it makes the important first impression in the manager’s mind. It is therefore very important for the researcher to focus on the presentation of the report. The other important aspect is the use of figures, graphs and tables. There is an old saying that ‘a picture is worth 1000 words’, and that is quite true when reporting the results of a research project. Figures, graphs and tables can help in interpretation as well as greatly enhance the look and feel of the report, which in turn can increase reader engagement.

If the report is prepared with the above key issues in mind, the overall credibility of the research report as well as of the researcher can be greatly enhanced.

KTL_003_C8_13: List the components of a generic marketing research report.

Following is the list of components for a generic marketing research report.
1. Title page
2. Table of contents
3. Executive summary
○ Research objectives
○ Brief discussion on methodology
○ Major findings
○ Conclusion
○ Recommendations
4. Introduction: Problem definition
5. Research design
○ Type of design used
○ Data collection
○ Scaling techniques
○ Questionnaire development and pilot testing
○ Sampling
○ Fieldwork
6. Data analysis and findings
○ Analysis techniques employed
○ Results
7. Conclusion and recommendation
8. Limitations and future directions
9. Appendices
○ Questionnaire and forms
○ Statistical output

KTL_003_C8_14: Write a brief note on report presentation.

The presentation has become an integral part of most marketing research projects. Most managers find it hard to read the entire report and so prefer the researcher to present it orally. Furthermore, the presentation provides an opportunity for the research and management teams to discuss the issues of concern, and in that way it becomes an important exercise.

For any presentation, the most important aspect is preparation. The researcher should first develop an outline of the presentation keeping the audience in mind. Once the outline is developed, the researcher should focus on managing the content and decide what is relevant and important and what is not. The use of audio-visual aids as well as other materials such as chalkboards or flipcharts should be planned in advance. While audio-visual presentation adds to the overall engagement, chalkboards and flipcharts provide flexibility in presentation.

The rules regarding what to do and what not to do when writing reports also apply to the presentation, and the researcher must keep in mind that the presentation is being given so that managers can grasp the results. The researcher must remember that the research was conducted to assist decision making and was not a statistical exercise. Therefore, the focus of the presentation should be on how the research can help managers make a better informed decision.

93 multiple-choice questions on Econometrics

A collection of 93 multiple-choice questions on basic econometrics in finance, in English (with answers included). The content is organized into 9 chapters and split into 2 parts. The multiple-choice questions in Part 1 include:

KTL_002_C1_1: The numerical score assigned to the credit rating of a bond is best described as what type of number?

○ Continuous
○ Cardinal
● Ordinal
○ Nominal

KTL_002_C1_2: Suppose that we wanted to sum the 2007 returns on ten shares to calculate the return on a portfolio over that year. What method of calculating the individual stock returns would enable us to do this?
● Simple
○ Continuously compounded
○ Neither approach would allow us to do this validly
○ Either approach could be used and they would both give the same portfolio return

KTL_002_C1_3: Consider a bivariate regression model with coefficient standard errors calculated using the usual formulae. Which of the following statements is/are correct regarding the standard error estimator for the slope coefficient?
(i) It varies positively with the square root of the residual variance (s)
(ii) It varies positively with the spread of X about its mean value
(iii) It varies positively with the spread of X about zero
(iv) It varies positively with the sample size T

● (i) only
○ (i) and (iv) only
○ (i), (ii) and (iv) only
○ (i), (ii), (iii) and (iv).

KTL_002_C1_4: In a time series regression of the excess return of a mutual fund on a constant and the excess return on a market index, which of the following statements should be true for the fund manager to be considered to have “beaten the market” in a statistical sense?
● The estimate for α should be positive and statistically significant
○ The estimate for α should be positive and statistically significantly greater than the risk-free rate of return
○ The estimate for α should be positive
○ The estimate for α should be negative and statistically significant.

KTL_002_C1_5: What result is proved by the Gauss-Markov theorem?
○ That OLS gives unbiased coefficient estimates
○ That OLS gives minimum variance coefficient estimates
● That OLS gives minimum variance coefficient estimates only among the class of linear unbiased estimators
○ That OLS ensures that the errors are distributed normally

KTL_002_C1_6: The type I error associated with testing a hypothesis is equal to
○ One minus the type II error
○ The confidence level
● The size of the test
○ The size of the sample

KTL_002_C1_7: Which of the following is a correct interpretation of a “95% confidence interval” for a regression parameter?
● We are 95% sure that the interval contains the true value of the parameter
○ We are 95% sure that our estimate of the coefficient is correct
○ We are 95% sure that the interval contains our estimate of the coefficient
○ In repeated samples, we would derive the same estimate for the coefficient 95% of the time

KTL_002_C1_8: Which of the following statements is correct concerning the conditions required for OLS to be a usable estimation technique?
● The model must be linear in the parameters
○ The model must be linear in the variables
○ The model must be linear in the variables and the parameters
○ The model must be linear in the residuals.

KTL_002_C1_9: Which of the following is NOT a good reason for including a disturbance term in a regression equation?
○ It captures omitted determinants of the dependent variable
● To allow for the non-zero mean of the dependent variable
○ To allow for errors in the measurement of the dependent variable
○ To allow for random influences on the dependent variable

KTL_002_C1_10: Which of the following is NOT correct with regard to the p-value attached to a test statistic?
● p-values can only be used for two-sided tests
○ It is the marginal significance level where we would be indifferent between rejecting and not rejecting the null hypothesis
○ It is the exact significance level for the test
○ Given the p-value, we can make inferences without referring to statistical tables

KTL_002_C1_11: Which one of the following is NOT an assumption of the classical linear regression model?
○ The explanatory variables are uncorrelated with the error terms.
○ The disturbance terms have zero mean
● The dependent variable is not correlated with the disturbance terms
○ The disturbance terms are independent of one another.

KTL_002_C1_12: Which of the following is the most accurate definition of the term “the OLS estimator”?
○ It comprises the numerical values obtained from OLS estimation
● It is a formula that, when applied to the data, will yield the parameter estimates
○ It is equivalent to the term “the OLS estimate”
○ It is a collection of all of the data used to estimate a linear regression model.

KTL_002_C1_13: Two researchers have identical models, data, coefficients and standard error estimates. They test the same hypothesis using a two-sided alternative, but researcher 1 uses a 5% size of test while researcher 2 uses a 10% test. Which one of the following statements is correct?
○ Researcher 2 will use a larger critical value from the t-tables
● Researcher 2 will have a higher probability of type I error
○ Researcher 1 will be more likely to reject the null hypothesis
○ Both researchers will always reach the same conclusion.

KTL_002_C1_14: Consider an increase in the size of the test used to examine a hypothesis from 5% to 10%. Which one of the following would be an implication?
● The probability of a Type I error is increased
○ The probability of a Type II error is increased
○ The rejection criterion has become more strict
○ The null hypothesis will be rejected less often.

KTL_002_C1_15: What is the relationship, if any, between the normal and t-distributions?
○ A t-distribution with zero degrees of freedom is a normal
○ A t-distribution with one degree of freedom is a normal
● A t-distribution with infinite degrees of freedom is a normal
○ There is no relationship between the two distributions.

KTL_002_C3_1: Consider a standard normally distributed variable, a t-distributed variable with d degrees of freedom, and an F-distributed variable with (1, d) degrees of freedom. Which of the following statements is FALSE?
○ The standard normal is a special case of the t-distribution, the square of which is a special case of the F-distribution.
● Since the three distributions are related, the 5% critical values from each will be the same.
○ Asymptotically, a given test conducted using any of the three distributions will lead to the same conclusion.
○ The normal and t- distributions are symmetric about zero while the F- takes only positive values.

KTL_002_C3_2: If our regression equation is $Y = X\beta + U$, where we have T observations and k regressors, what will be the dimension of $\hat{\beta}$ using the standard matrix notation?
○ T x k
○ T x 1
● k x 1
○ k x k

Question 3 refers to the following regression estimated on 64 observations:

KTL_002_C3_3: Which of the following null hypotheses could we test using an F-test?
(i) β2 = 0
(ii) β2 = 1 and β3 + β4 = 1
(iii) β3β4 = 1
(iv) β2 – β3 – β4 = 1

○ (i) and (ii) only
○ (ii) and (iv) only
○ (i), (ii), (iii) and (iv)
● (i), (ii), and (iv) only

For question 4, you are given the following data

be approximately
○ 19.48
○ 2.76
● 2.37
○ 3.11

KTL_002_C3_6: What is the relationship, if any, between t-distributed and F-distributed random variables?
○ A t-variate with z degrees of freedom is also an F(1, z)
● The square of a t-variate with z degrees of freedom is also an F(1, z)
○ A t-variate with z degrees of freedom is also an F(z, 1)
○ There is no relationship between the two distributions.

KTL_002_C3_7: Which one of the following statements must hold for EVERY CASE concerning the residual sums of squares for the restricted and unrestricted regressions?

KTL_002_C3_8: Which one of the following is the most appropriate as a definition of $R^2$ in the context in which the term is usually used?
○ It is the proportion of the total variability of y that is explained by the model
● It is the proportion of the total variability of y about its mean value that is explained by the model
○ It is the correlation between the fitted values and the residuals
○ It is the correlation between the fitted values and the mean.

KTL_002_C3_9: Suppose that the value of $R^2$ for an estimated regression model is exactly one. Which of the following are true?
(i) All of the data points must lie exactly on the line
(ii) All of the residuals must be zero
(iii) All of the variability of y about its mean has been explained by the model
(iv) The fitted line will be horizontal with respect to all of the explanatory variables

○ (ii) and (iv) only
○ (i) and (iii) only
● (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv)

KTL_002_C3_10: Consider the following two regressions

Which of the following statements are true?
(i) The RSS will be the same for the two models
(ii) The $R^2$ will be the same for the two models
(iii) The adjusted $R^2$ will be different for the two models
(iv) The regression F-test will be the same for the two models

○ (ii) and (iv) only
● (i) and (iii) only
○ (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv).

KTL_002_C3_11: Which of the following are often considered disadvantages of the use of adjusted $R^2$ as a variable addition / variable deletion rule?
(i) Adjusted $R^2$ always rises as more variables are added
(ii) Adjusted $R^2$ often leads to large models with many marginally significant or marginally insignificant variables
(iii) Adjusted $R^2$ cannot be compared for models with different explanatory variables
(iv) Adjusted $R^2$ cannot be compared for models with different explained variables.

● (ii) and (iv) only
○ (i) and (iii) only
○ (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv).

KTL_002_C4_1: A researcher conducts a Breusch-Godfrey test for autocorrelation using 3 lags of the residuals in the auxiliary regression. The original regression contained 5 regressors including a constant term, and was estimated using 105 observations. What is the critical value using a 5% significance level for the LM test based on $TR^2$?
○ 1.99
○ 2.70
● 7.81
○ 8.56.
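As a rough check of the 7.81 figure: under the null, the LM statistic is asymptotically chi-square with degrees of freedom equal to the number of residual lags in the auxiliary regression (3 here). The sketch below is only illustrative; the function names are hypothetical, and it relies on the closed-form chi-square CDF that exists for 3 degrees of freedom, solved by bisection.

```python
import math

def chi2_cdf_3df(x):
    """CDF of a chi-square variable with 3 degrees of freedom (closed form)."""
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def critical_value(level=0.05, lo=0.0, hi=50.0, tol=1e-10):
    """Upper-tail critical value found by bisection on the CDF."""
    target = 1.0 - level
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_cdf_3df(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

crit = critical_value()
print(round(crit, 2))  # 7.81
```

Note that the number of regressors and observations in the original regression does not enter the degrees of freedom of this test.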

KTL_002_C4_2: Which of the following would NOT be a potential remedy for the problem of multicollinearity between regressors?
○ Removing one of the explanatory variables
● Transforming the data into logarithms
○ Transforming two of the explanatory variables into ratios
○ Collecting higher frequency data on all of the variables

KTL_002_C4_3: Which of the following conditions must be fulfilled for the Durbin Watson test to be valid?
(i) The regression includes a constant term
(ii) The regressors are non-stochastic
(iii) There are no lags of the dependent variable in the regression
(iv) There are no lags of the independent variables in the regression

● (i), (ii) and (iii) only
○ (i) and (ii) only
○ (i), (ii), (iii) and (iv)
○ (i), (ii), and (iv) only

KTL_002_C4_4: If the residuals of a regression on a large sample are found to be heteroscedastic which of the following might be a likely consequence?
(i) The coefficient estimates are biased
(ii) The standard error estimates for the slope coefficients may be too small
(iii) Statistical inferences may be wrong

○ (i) only
● (ii) and (iii) only
○ (i), (ii) and (iii)
○ (i) and (ii) only

KTL_002_C4_5: The value of the Durbin Watson test statistic in a regression with 4 regressors (including the constant term) estimated on 100 observations is 3.6. What might we suggest from this?
○ The residuals are positively autocorrelated
● The residuals are negatively autocorrelated
○ There is no autocorrelation in the residuals
○ The test statistic has fallen in the intermediate region
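One way to see why a value of 3.6 points to negative autocorrelation is the standard approximation DW ≈ 2(1 − ρ̂), where ρ̂ is the first-order autocorrelation of the residuals. The sketch below uses hypothetical helper names and a made-up alternating residual series purely for illustration.

```python
def durbin_watson(residuals):
    """DW statistic: sum of squared first differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def implied_rho(dw):
    """First-order autocorrelation implied by DW ~ 2 * (1 - rho)."""
    return 1 - dw / 2

rho = implied_rho(3.6)
print(round(rho, 2))  # -0.8

# Illustrative residuals that flip sign every period (strong negative autocorrelation)
alternating = [1.0, -1.0] * 50
dw_alt = durbin_watson(alternating)
print(round(dw_alt, 2))  # 3.96
```

A formal conclusion would still require comparing 3.6 with 4 − dL and 4 − dU from the Durbin-Watson tables.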

KTL_002_C4_6: Which of the following is NOT a good reason for including lagged variables in a regression?
○ Slow response of the dependent variable to changes in the independent variables
○ Over-reactions of the dependent variables
○ The dependent variable is a centred moving average of the past 4 values of the series
● The residuals of the model appear to be non-normal

KTL_002_C4_7: What is the long run solution to the following dynamic econometric model?

$\Delta Y_{t}=\beta_{1}+\beta_{2} \Delta X_{2 t}+\beta_{3} \Delta X_{3 t}+U_{t}$

○ $Y_{t}=\beta_{1}+\beta_{2} X_{2}+\beta_{3} X_{3}$
○ $Y_{t}=\beta_{1}+\beta_{2} X_{2 t}+\beta_{3} X_{3 t}$
○ $Y_{t}=-\frac{\beta_{2}}{\beta_{1}} X_{2}-\frac{\beta_{3}}{\beta_{1}} X_{3}$
● There is no long run solution to this equation

KTL_002_C4_8: Which of the following would you expect to be a problem associated with adding lagged values of the dependent variable into a regression equation?
● The assumption that the regressors are non-stochastic is violated
○ A model with many lags may lead to residual non-normality
○ Adding lags may induce multicollinearity with current values of variables
○ The standard errors of the coefficients will fall as a result of adding more explanatory variables

KTL_002_C4_9: A normal distribution has coefficients of skewness and excess kurtosis which are respectively
● 0 and 0
○ 0 and 3
○ 3 and 0
○ Will vary from one normal distribution to another

KTL_002_C4_10: Which of the following would probably NOT be a potential “cure” for non-normal residuals?
● Transforming two explanatory variables into a ratio
○ Removing large positive residuals
○ Using a procedure for estimation and inference which did not assume normality
○ Removing large negative residuals

KTL_002_C4_11: What would be the consequences for the OLS estimator if autocorrelation is present in a regression model but ignored?
○ It will be biased
○ It will be inconsistent
● It will be inefficient
○ All of a, b and c will be true.

KTL_002_C4_12: If OLS is used in the presence of heteroscedasticity, which of the following will be likely consequences?
(i) Coefficient estimates may be misleading
(ii) Hypothesis tests could reach the wrong conclusions
(iii) Forecasts made from the model could be biased
(iv) Standard errors may be inappropriate

● (ii) and (iv) only
○ (i) and (iii) only
○ (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv).

KTL_002_C4_13: If a residual series is negatively autocorrelated, which one of the following is the most likely value of the Durbin Watson statistic?
○ Close to zero
○ Close to two
● Close to four
○ Close to one.

KTL_002_C4_14: If the residuals of a model containing lags of the dependent variable are autocorrelated, which one of the following could this lead to?
○ Biased but consistent coefficient estimates
● Biased and inconsistent coefficient estimates
○ Unbiased but inconsistent coefficient estimates
○ Unbiased and consistent but inefficient coefficient estimates.

KTL_002_C4_15: Which one of the following is NOT a symptom of near multicollinearity?
○ The $R^2$ value is high
○ The regression results change substantively when one particular variable is deleted
● Confidence intervals on parameter estimates are narrow
○ Individual parameter estimates are insignificant

KTL_002_C4_16: Which one of the following would be the most appropriate auxiliary regression for a Ramsey RESET test of functional form?

● $y_{t}=\alpha_{0}+\alpha_{1} \hat{y}_{t}^{2}+v_{t}$
○ $y_{t}^{2}=\alpha_{0}+\alpha_{1} x_{2 t}+\alpha_{2} x_{3 t}+\alpha_{4} x_{2 t}^{2}+\alpha_{5} x_{3 t}^{2}+\alpha_{6} x_{2 t} x_{3 t}+v_{t}$
○ $\hat{u}_{t}^{2}=\alpha_{0}+\alpha_{1} \hat{y}_{t}^{2}+v_{t}$
○ $u_{t}=\alpha_{0}+\alpha_{1} x_{2 t}+\alpha_{2} x_{3 t}+\alpha_{4} x_{2 t}^{2}+\alpha_{5} x_{3 t}^{2}+\alpha_{6} x_{2 t} x_{3 t}+v_{t}$

KTL_002_C4_17: If a regression equation contains an irrelevant variable, the parameter estimates will be
● Consistent and unbiased but inefficient
○ Consistent and asymptotically efficient but biased
○ Inconsistent
○ Consistent, unbiased and efficient.

KTL_002_C4_18: Put the following steps of the model-building process in the order in which it would be statistically most appropriate to do them:
(i) Estimate model
(ii) Conduct hypothesis tests on coefficients
(iii) Remove irrelevant variables
(iv) Conduct diagnostic tests on the model residuals

○ (i) then (ii) then (iii) then (iv)
○ (i) then (iv) then (ii) then (iii)
● (i) then (iv) then (iii) then (ii)
○ (i) then (iii) then (ii) then (iv).

A collection of 93 multiple-choice questions on introductory econometrics for finance, in English (answers included). The content is organized into 9 chapters and split into 2 parts. The multiple-choice questions in Part 2 are:

KTL_002_C5_1: Consider the following model estimated for a time series

where $\varepsilon_t$ is a zero mean error process. What is the (unconditional) mean of the series, $y_t$?

● 0.6
○ 0.3
○ 0.0
○ 0.4

KTL_002_C5_2: Consider the following single exponential smoothing model: $S_t = \alpha X_t + (1-\alpha) S_{t-1}$
You are given the following data: $\hat{\alpha} = 0.1$, $X_t = 0.5$, $S_{t-1} = 0.2$

If we believe that the true DGP can be approximated by the exponential smoothing model, what would be an appropriate 2-step ahead forecast for X? (i.e. a forecast of $X_{t+2}$ made at time t)

○ 0.2
● 0.23
○ 0.5
○ There is insufficient information given in the question to form more than a one step ahead forecast.
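The arithmetic behind the 0.23 answer can be sketched as follows. Under single exponential smoothing the forecast for every future horizon equals the current smoothed value, so the 2-step ahead forecast is simply the smoothed value at time t. The function name is a hypothetical one chosen for this illustration.

```python
def smooth(alpha, x_t, s_prev):
    """One update of single exponential smoothing."""
    return alpha * x_t + (1 - alpha) * s_prev

# Values given in the question
s_t = smooth(alpha=0.1, x_t=0.5, s_prev=0.2)

# Forecasts are flat: the same smoothed value applies at every horizon
forecasts = {h: s_t for h in (1, 2, 3)}
print(round(forecasts[2], 2))  # 0.23
```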

KTL_002_C5_3: Consider the following MA(3) process: $y_t = 0.1 + 0.4u_{t-1} + 0.2u_{t-2} - 0.1u_{t-3} + u_t$
What is the optimal forecast for $y_t$, 3 steps into the future (i.e. for time t+2 if all information until time t-1 is available), if you have the following data? $u_{t-1} = 0.3$; $u_{t-2} = -0.6$; $u_{t-3} = -0.3$

○ 0.4
○ 0.0
● 0.07
○ –0.1
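A minimal sketch of the forecast calculation, assuming the usual rule that shocks dated after the forecast origin are replaced by their expected value of zero. The function name and argument layout are hypothetical; known shocks are listed newest first, starting at the forecast origin (time t-1 here).

```python
def ma_forecast(mu, thetas, known_shocks, h):
    """h-step ahead forecast of an MA(q) process.

    thetas[j-1] multiplies the shock lagged j periods.
    known_shocks[k] is the shock dated k periods before the forecast origin.
    """
    total = mu
    for j, theta in enumerate(thetas, start=1):
        k = j - h  # how far before the origin the required shock is dated
        if 0 <= k < len(known_shocks):
            total += theta * known_shocks[k]
        # otherwise the shock is in the future (or unavailable) and contributes zero
    return total

thetas = [0.4, 0.2, -0.1]
shocks = [0.3, -0.6, -0.3]  # u at t-1, t-2, t-3

three_step = ma_forecast(0.1, thetas, shocks, h=3)
print(round(three_step, 2))  # 0.07
```

Only the third lag still reaches back to an observed shock at this horizon; from four steps ahead onward the forecast collapses to the intercept, 0.1.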

KTL_002_C5_4: Which of the following sets of characteristics would usually best describe an autoregressive process of order 3 (i.e. an AR(3))?

● A slowly decaying acf, and a pacf with 3 significant spikes
○ A slowly decaying pacf and an acf with 3 significant spikes
○ A slowly decaying acf and pacf
○ An acf and a pacf with 3 significant spikes

KTL_002_C5_5: A process, xt, which has a constant mean and variance, and zero autocovariance for all non-zero lags is best described as

● A white noise process
○ A covariance stationary process
○ An autocorrelated process
○ A moving average process

KTL_002_C5_6: Which of the following conditions must hold for the autoregressive part of an ARMA model to be stationary?

● All roots of the characteristic equation must lie outside the unit circle
○ All roots of the characteristic equation must lie inside the unit circle
○ All roots must be smaller than unity
○ At least one of the roots must be bigger than one in absolute value.
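The root condition can be checked numerically. The sketch below handles AR(1) and AR(2) models only, writing the characteristic equation as 1 − φ1·z − φ2·z² = 0; the function names and the coefficient values in the examples are illustrative assumptions, not taken from the questions.

```python
import cmath

def ar_roots(phi1, phi2=0.0):
    """Roots of the characteristic equation 1 - phi1*z - phi2*z**2 = 0."""
    if phi2 == 0.0:
        return [] if phi1 == 0.0 else [1.0 / phi1]
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    return [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]

def is_stationary(phi1, phi2=0.0):
    """Stationary if every root lies strictly outside the unit circle."""
    return all(abs(root) > 1 for root in ar_roots(phi1, phi2))

print(is_stationary(0.5, 0.2))  # True
print(is_stationary(1.0))       # False: unit root (random walk)
print(is_stationary(1.5))       # False: root inside the unit circle
```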

KTL_002_C5_7: Which of the following statements are true concerning time-series forecasting?
(i) All time-series forecasting methods are essentially extrapolative.
(ii) Forecasting models are prone to perform poorly following a structural break in a series.
(iii) Forecasting accuracy often declines with prediction horizon.
(iv) The mean squared errors of forecasts are usually very highly correlated with the profitability of employing those forecasts in a trading strategy.

○ (i), (ii), (iii), and (iv)
● (i), (ii) and (iii) only
○ (ii), (iii) only
○ (ii) and (iv) only

KTL_002_C5_8: If a series, yt, follows a random walk (with no drift), what is the optimal 1-step ahead forecast for y?
● The current value of y.
○ Zero.
○ The historical unweighted average of y.
○ An exponentially weighted average of previous values of y.

KTL_002_C5_9: Consider a series that follows an MA(1) with zero mean and a moving average coefficient of 0.4. What is the value of the autocorrelation function at lag 1?
○ 0.4
○ 1
● 0.34
○ It is not possible to determine the value of the autocovariances without knowing the disturbance variance.
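For an MA(1) written as y_t = u_t + θ·u_{t−1}, the lag-1 autocorrelation is θ / (1 + θ²); the disturbance variance cancels between the autocovariance and the variance, which is why it is not needed. A small sketch with a hypothetical function name:

```python
def ma1_acf_lag1(theta):
    """Lag-1 autocorrelation of y_t = u_t + theta * u_{t-1}."""
    return theta / (1 + theta ** 2)

rho1 = ma1_acf_lag1(0.4)
print(round(rho1, 2))  # 0.34
```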

KTL_002_C5_10: Which of the following statements are true?
(i) An MA(q) can be expressed as an AR(infinity) if it is invertible
(ii) An AR(p) can be written as an MA(infinity) if it is stationary
(iii) The (unconditional) mean of an ARMA process will depend only on the intercept and on the AR coefficients and not on the MA coefficients
(iv) A random walk series will have zero pacf except at lag 1

○ (ii) and (iv) only
○ (i) and (iii) only
○ (i), (ii), and (iii) only
● (i), (ii), (iii), and (iv).

KTL_002_C5_11: Consider the following picture and suggest the model from the following list that best characterises the process:

○ An AR(1)
○ An AR(2)
● An ARMA(1,1)
○ An MA(3)

The acf is clearly declining very slowly in this case, which is consistent with there being an autoregressive part to the appropriate model. The pacf is clearly significant for lags one and two, but the question is whether it then becomes insignificant for lags 3 and 4, indicating an AR(2) process, or remains significant, which would be more consistent with a mixed ARMA process. Given the huge size of the sample that gave rise to this acf and pacf, even a pacf value of 0.001 would still be statistically significant. Thus an ARMA process is the most likely candidate, although it would not be possible to tell from the acf and pacf alone which model from the ARMA family was most appropriate. The DGP for the data that generated this plot was $y_t = 0.9y_{t-1} - 0.3u_{t-1} + u_t$.

KTL_002_C5_12: Which of the following models can be estimated using ordinary least squares?
(i) An AR(1)
(ii) An ARMA(2,0)
(iii) An MA(1)
(iv) An ARMA(1,1)

○ (i) only
● (i) and (ii) only
○ (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv).

KTL_002_C5_13: If a series, y, is described as “mean-reverting”, which model from the following list is likely to produce the best long-term forecasts for that series y?
○ A random walk
● The long term mean of the series
○ A model from the ARMA family
○ A random walk with drift

KTL_002_C5_14: Consider the following AR(2) model. What is the optimal 2-step ahead forecast for y if all information available is up to and including time t, if the values of y at time t, t-1 and t-2 are –0.3, 0.4 and –0.1 respectively, and the value of u at time t-1 is 0.3?

○ -0.1
○ 0.27
● -0.34
○ 0.30

KTL_002_C5_15: What is the optimal three-step ahead forecast from the AR(2) model given in question 14?
○ -0.1
○ 0.27
○ -0.34
● -0.31

KTL_002_C5_16: Suppose you had to guess at the most likely value of a one hundred step-ahead forecast for the AR(2) model given in question 14 – what would your forecast be?
○ -0.1
○ 0.7
● –0.27
○ 0.75

KTL_002_C6_1: Which of the following are characteristics of vector autoregressive (VAR) models?

(i) They are typically a-theoretical and data driven
(ii) they can easily lead to overfitting
(iii) all variables on the right hand side of the equation are pre-determined
(iv) their interpretation is often difficult from a theoretical perspective

● (i), (ii), (iii) and (iv)
○ (i), (ii), and (iv) only
○ (i) and (ii) only
○ (i) and (iv) only

For questions 2 and 3, consider the following set of simultaneous equations:

\begin{aligned}
&Y_{1 t}=\alpha_{0}+\alpha_{1} Y_{2 t}+\alpha_{2} Y_{3 t}+\alpha_{4} X_{1 t}+u_{1 t} \\
&Y_{2 t}=\beta_{0}+\beta_{1} Y_{1 t}+\beta_{2} X_{1 t}+\beta_{3} X_{2 t}+\beta_{4} X_{3 t}+u_{2 t} \\
&Y_{3 t}=\gamma_{0}+\gamma_{1} Y_{1 t}+u_{3 t}
\end{aligned}

Assume that the Y’s are endogenous and the X’s exogenous variables, and that the error terms are uncorrelated.

KTL_002_C6_2: Which of the following statements is true of equation (3)?

○ According to the order condition, it is not identified
○ According to the order condition, it is just identified
● According to the order condition, it is over-identified
○ There is insufficient information given in the question to determine whether the equation is identified or not.

KTL_002_C6_3: Estimation of equation (2) on its own using OLS would result in

○ Consistent and unbiased coefficient estimates
○ Consistent coefficient estimates which might be biased in small samples
○ Inconsistent but unbiased coefficient estimates
● Coefficient estimates that are neither unbiased nor consistent.

KTL_002_C6_4: Which of the following statements is incorrect?

○ Equations that are part of a recursive system can be validly estimated using OLS
○ Unnecessary use of two-stage least squares (2SLS) – i.e. on a set of right hand side variables that are in fact exogenous – will result in consistent but inefficient coefficient estimates.
○ 2SLS is just a special case of instrumental variables (IV) estimation.
● 2SLS and indirect least squares (ILS) are equivalent for over-identified systems.

KTL_002_C6_5: Which of the following could be viewed as a disadvantage of the vector autoregressive (VAR) approach to modelling?

○ We do not need to specify which variables are endogenous and which are exogenous
○ Standard form VARs can be estimated equation-by-equation using OLS
● VARs often contain a large number of terms
○ VARs can be expressed using a very compact notation.

KTL_002_C6_6: Consider the following bivariate VAR(2):

\begin{aligned}
&y_{1 t}=\alpha_{10}+\alpha_{11} y_{1 t-1}+\alpha_{12} y_{1 t-2}+\alpha_{13} y_{2 t-1}+\alpha_{14} y_{2 t-2}+u_{1 t} \\
&y_{2 t}=\alpha_{20}+\alpha_{21} y_{1 t-1}+\alpha_{22} y_{1 t-2}+\alpha_{23} y_{2 t-1}+\alpha_{24} y_{2 t-2}+u_{2 t}
\end{aligned}

Which of the following coefficient significances are required to be able to say that y1 Granger-causes y2 but not the other way around?

○ $\alpha_{13}$ and $\alpha_{14}$ significant; $\alpha_{21}$ and $\alpha_{22}$ not significant
● $\alpha_{21}$ and $\alpha_{22}$ significant; $\alpha_{13}$ and $\alpha_{14}$ not significant
○ $\alpha_{21}$ and $\alpha_{23}$ significant; $\alpha_{11}$ and $\alpha_{13}$ not significant
○ $\alpha_{11}$ and $\alpha_{13}$ significant; $\alpha_{21}$ and $\alpha_{23}$ not significant

KTL_002_C6_7: Which of the following statements is true concerning VAR impulse response functions?
(i) Impulse responses help the researcher to investigate the interactions between the variables in the VAR.
(ii) An impulse response analysis is where we examine the effects of applying unit shocks to all of the variables at the same time.
(iii) Impulse responses involve calculating the proportion of the total forecast error variance of a given variable that is explained by innovations to each variable.
(iv) If the standard error bars around the impulse responses for a given lag span (i.e. include) the x-axis, it would be said that the response is statistically significant.

○ (i), (ii), (iii), and (iv)
○ (i), (ii) and (iii) only
○ (i) only
● (i) and (ii) only

KTL_002_C6_8: In the context of simultaneous equations modelling, which of the following statements is true concerning an exogenous variable?
○ The values of exogenous variables are determined within the system
● The exogenous variables are assumed to be fixed in repeated samples
○ Reduced form equations will not contain any exogenous variables on the RHS
○ Reduced form equations will contain only exogenous variables on the LHS

KTL_002_C6_9: Comparing the information criteria approach with the likelihood ratio test approach to determining the optimal VAR lag length, which one of the following statements is true?
○ The choice of stiffness of penalty term will not affect the model choice
○ The validity of information criteria relies upon normal residuals
● Conducting a likelihood ratio test could lead to a sub-optimal model selection
○ An application of the univariate information criteria to each equation will give identical results to the application of a multivariate version of the criteria to all of the equations jointly

KTL_002_C6_10: The second stage in two-stage least squares estimation of a simultaneous system would be to
○ Estimate the reduced form equations
● Replace the endogenous variables that are on the RHS of the structural equations with their reduced form fitted values
○ Replace all endogenous variables in the structural equations with their reduced form fitted values
○ Use the fitted values of the endogenous variables from the reduced forms as additional variables in the structural equations.

KTL_002_C7_1: Which of the following are probably valid criticisms of the Dickey Fuller methodology?

(i) The tests have a unit root under the null hypothesis and this may not be rejected due to insufficient information in the sample
(ii) the tests are poor at detecting a stationary process with a root close to the non-stationary boundary
(iii) the tests are highly complex to calculate in practice
(iv) the tests have low power in small samples

○ (i), (ii), (iii) and (iv)
● (i), (ii), and (iv) only
○ (i) and (iii) only
○ (ii) only

KTL_002_C7_2: Which of the following are problems associated with the Engle-Granger approach to modelling using cointegrated data?
(i) The coefficients in the cointegrating relationship are hard to calculate
(ii) This method requires the researcher to assume that one variable is the dependent variable and the others are independent variables
(iii) The Engle-Granger technique can only detect one cointegrating relationship
(iv) Engle-Granger does not allow the testing of hypotheses involving the actual cointegrating relationship.

○ (i), (ii), (iii), and (iv)
● (ii), (iii) and (iv) only
○ (ii), (iii) only
○ (ii) and (iv) only

KTL_002_C7_3: Consider the following vector error correction (VECM) model:

\Delta y_{t}=\Pi y_{t-5}+\Gamma_{1} \Delta y_{t-1}+\Gamma_{2} \Delta y_{t-2}+\Gamma_{3} \Delta y_{t-3}+\Gamma_{4} \Delta y_{t-4}+u_{t}

where $y_{t}$ is a $k \times 1$ vector of variables, and $u_{t}$ is a $k \times 1$ vector of disturbances. Which of the following statements is true of the VECM?

○ Johansen's test for cointegration centres on the rank of the matrix $\Gamma_{1}$
○ If the variables $y_{t}$ are cointegrated, $\Pi$ will be of full rank
○ If the rank of $\Pi$ is zero, the variables are cointegrated
● Provided that all of the series in $y$ are nonstationary, the rank of $\Pi$ can be at most $k-1$.

KTL_002_C7_4: Consider the following matrix: $X=\left[\begin{array}{ll}3 & 6 \\ 1 & 2\end{array}\right]$

What are its characteristic roots?
● 5 and 0
○ 5 and 5
○ 3 and 2
○ 0 and 0
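The characteristic roots (eigenvalues) of a 2×2 matrix solve λ² − tr(X)·λ + det(X) = 0. A short sketch for the matrix with rows (3, 6) and (1, 2); the function name is hypothetical.

```python
import cmath

def characteristic_roots_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the trace and determinant."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace ** 2 - 4 * det)
    return [(trace + disc) / 2, (trace - disc) / 2]

roots = characteristic_roots_2x2(3, 6, 1, 2)
print([round(r.real, 6) for r in roots])  # [5.0, 0.0]
```

The zero root reflects the fact that the second row is a multiple of the first, so the matrix has rank 1.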

KTL_002_C7_5: You have the following data for Johansen’s λmax rank test for cointegration between 4 international equity market indices:

r | λmax | 5% Critical Value

How many cointegrating vectors are there?

○ 0
○ 1
● 2
○ 3

KTL_002_C7_6: Which criticism of Dickey-Fuller (DF) -type tests is addressed by stationarity tests, such as the KPSS test?
● DF tests have low power to reject the null hypothesis of a unit root, particularly in small samples.
○ DF tests are always over-sized.
○ DF tests do not allow the researcher to test hypotheses about the cointegrating vector.
○ DF tests can only find at most one cointegrating relationship.

KTL_002_C7_7: Consider the following data generating process for a series $y_t$: $y_t = \mu + 1.5y_{t-1} + u_t$

Which one of the following most accurately describes the process for yt?

○ A random walk with drift
○ A non-stationary process
○ A deterministic trend process
● An explosive process.
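To see what distinguishes the explosive case, note that in an AR(1) with coefficient φ the effect of a unit shock after h periods is φ^h. The sketch below compares a stationary coefficient (0.5, chosen only for illustration), a unit root (1.0), and the coefficient from the question (1.5).

```python
def shock_effect(phi, horizon):
    """Effect of a unit shock after `horizon` periods in an AR(1)."""
    return phi ** horizon

effects = {
    phi: [shock_effect(phi, h) for h in (0, 5, 10)]
    for phi in (0.5, 1.0, 1.5)
}

for phi, path in effects.items():
    print(phi, [round(v, 4) for v in path])
```

With φ = 0.5 the shock dies away, with φ = 1 it persists unchanged, and with φ = 1.5 it grows without bound, which is why "explosive" is a more precise description than simply "non-stationary".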

KTL_002_C7_8: Which one of the following best describes most series of asset prices?
○ An independently and identically distributed (iid, i.e. “completely random”) process
● A random walk with drift
○ An explosive process
○ A deterministic trend process

KTL_002_C7_9: If there are three variables that are being tested for cointegration, what is the maximum number of linearly independent cointegrating relationships that there could be?
○ 0
○ 1
● 2
○ 3

KTL_002_C7_10: If the number of non-zero eigenvalues of the pi matrix under a Johansen test is 2, this implies that
● There are 2 linearly independent cointegrating vectors
○ There are at most 2 linearly independent cointegrating vectors
○ There are 3 variables in the system
○ There are at least 2 linearly independent cointegrating vectors

KTL_002_C7_11: If a Johansen “max” test for a null hypothesis of 1 cointegrating vector is conducted on a system containing 4 variables, which eigenvalue would be used in the test?
○ The largest
● The second largest
○ The second smallest
○ The smallest

KTL_002_C7_12: Consider the testing of hypotheses concerning the cointegrating vector(s) under the Johansen approach. Which of the following statements is correct?
○ If the restriction(s) is (are) rejected, the number of cointegrating vectors will rise
○ If the restriction(s) is (are) rejected, the number of eigenvalues will fall
○ Whether the restriction is supported by the data or not, the eigenvalues are likely to change at least slightly upon imposing the restriction(s)
● All linear combinations of the cointegrating vectors are themselves cointegrating vectors

KTL_002_C8_1: What would typically be the shape of the news impact curve for a series that exactly followed a GARCH (1,1) process?

○ It would be asymmetric, with a steeper curve on the left than the right
○ It would be asymmetric, with a steeper curve on the right than the left
● It would be symmetric about zero
○ It would be discontinuous about zero

KTL_002_C8_2: Which of the following are NOT features of an IGARCH(1,1) model?
(i) Forecasts of the conditional variance will converge upon the unconditional variance as the horizon tends to infinity
(ii) The sum of the coefficients on the lagged squared error and the lagged conditional variance will be unity
(iii) Forecasts of the conditional variance will decline gradually towards zero as the horizon tends to infinity
(iv) Such models are never observed in reality

● (ii) only
○ (ii) and (iv) only
○ (ii), (iii) and (iv) only
○ (i), (ii), (iii) and (iv)

KTL_002_C8_3: Which of the following would represent the most appropriate definition for implied volatility?
● It is the volatility of the underlying asset’s returns implied from the price of a traded option and an option pricing model
○ It is the volatility of the underlying asset’s returns implied from a statistical model such as GARCH
○ It is the volatility of an option price implied from a statistical model such as GARCH
○ It is the volatility of an option price implied from the underlying asset volatility

KTL_002_C8_41: Suppose that a researcher wanted to obtain an estimate of realised (“actual”) volatility. Which one of the following is likely to be the most accurate measure of volatility of stock returns for a particular day?
○ The price range (high minus low) on that day
○ The squared return on that day
● The sum of the squares of hourly returns on that day
○ The squared return on the previous day

KTL_002_C8_5: Suppose that a researcher wishes to test for calendar (seasonal) effects using a dummy variables approach. Which of the following regressions could be used to examine this?
(i) A regression containing intercept dummies
(ii) A regression containing slope dummies
(iii) A regression containing intercept and slope dummies
(iv) A regression containing a dummy variable taking the value 1 for one observation and zero for all others

○ (ii) and (iv) only
○ (i) and (iii) only
● (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv).

KTL_002_C8_6: Which of the following is the most plausible test regression for determining whether a series y contains “ARCH effects”?

○ $y_{t}^{2}=\alpha_{0}+\alpha_{1} y_{t-1}+\alpha_{2} y_{t-2}+\alpha_{3} y_{t-3}+\alpha_{4} y_{t-4}+\alpha_{5} y_{t-5}+u_{t}$
● $y_{t}^{2}=\alpha_{0}+\alpha_{1} y_{t-1}^{2}+\alpha_{2} y_{t-2}^{2}+\alpha_{3} y_{t-3}^{2}+\alpha_{4} y_{t-4}^{2}+\alpha_{5} y_{t-5}^{2}+u_{t}$
○ $y_{t}=\alpha_{0}+\alpha_{1} y_{t-1}^{2}+\alpha_{2} y_{t-2}^{2}+\alpha_{3} y_{t-3}^{2}+\alpha_{4} y_{t-4}^{2}+\alpha_{5} y_{t-5}^{2}+u_{t}$
○ $y_{t}=\alpha_{0}+\alpha_{1} y_{t-1}^{2}+\alpha_{2} y_{t-2}^{3}+\alpha_{3} y_{t-3}^{4}+\alpha_{4} y_{t-4}^{5}+\alpha_{5} y_{t-5}^{6}+u_{t}$

KTL_002_C8_7: Consider the following conditional variance equation for a GJR model.

$h_{t}=\alpha_{0}+\alpha_{1} u_{t-1}^{2}+\beta h_{t-1}+\gamma u_{t-1}^{2} I_{t-1}$

where $I_{t-1}=1$ if $u_{t-1}<0$, and $I_{t-1}=0$ otherwise.

For there to be evidence of a leverage effect, which one of the following conditions must hold?
○ $\alpha_{0}$ positive and statistically significant
● $\gamma$ positive and statistically significant
○ $\gamma$ statistically significantly greater than $\alpha_{0}$
○ $\alpha_{1}+\beta$ statistically significantly less than $\gamma$
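The role of γ in the GJR equation can be illustrated numerically. The parameter values below are made up for this sketch and do not come from the question; the point is only that, with γ > 0, a negative shock raises next-period conditional variance by more than a positive shock of the same size.

```python
def gjr_variance(u_prev, h_prev, alpha0, alpha1, beta, gamma):
    """Conditional variance from the GJR equation for one period."""
    indicator = 1.0 if u_prev < 0 else 0.0  # switches on only for negative shocks
    return alpha0 + alpha1 * u_prev ** 2 + beta * h_prev + gamma * u_prev ** 2 * indicator

# Hypothetical parameter values, for illustration only
params = dict(alpha0=0.1, alpha1=0.05, beta=0.85, gamma=0.1)

h_after_negative = gjr_variance(u_prev=-1.0, h_prev=1.0, **params)
h_after_positive = gjr_variance(u_prev=1.0, h_prev=1.0, **params)

print(round(h_after_negative, 2), round(h_after_positive, 2))  # 1.1 1.0
```

Setting gamma to zero in this sketch makes the two values equal, which corresponds to the symmetric news impact curve of a plain GARCH(1,1).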

KTL_002_C8_8: Consider the three approaches to conducting hypothesis tests under the maximum likelihood framework. Which of the following statements are true?
(i) The Wald test is based on estimation only under the null hypothesis
(ii) The likelihood ratio test is based on estimation under both the null and the alternative hypotheses
(iii) The lagrange multiplier test is based on estimation under the alternative hypothesis only
(iv) The usual t and F-tests are examples of Wald tests

● (ii) and (iv) only
○ (i) and (iii) only
○ (i), (ii), and (iv) only
○ (i), (ii), (iii), and (iv)

KTL_002_C8_9: If a series possesses the “Markov property”, what would this imply?
(i) The series is path-dependent
(ii) All that is required to produce forecasts for the series is the current value of the series plus a transition probability matrix
(iii) The state-determining variable must be observable
(iv) The series can be classified as to whether it is in one regime or another regime, but it can only be in one regime at any one time

● (ii) only
○ (i) and (ii) only
○ (i), (ii), and (iii) only
○ (i), (ii), (iii), and (iv)

KTL_002_C8_10: Which one of the following problems in finance could not be usefully addressed by either a univariate or a multivariate GARCH model?
○ Producing option prices
○ Producing dynamic hedge ratios
○ Producing time-varying beta estimates for a stock
● Producing forecasts of returns for use in trading models
○ Producing correlation forecasts for value at risk models

180 Multiple-Choice Questions on Econometrics

A collection of 180 multiple-choice questions, in English, on basic econometrics in finance (answers included). The material is divided into 18 chapters, as follows:

  • Chapter 1: Economic Questions and Data
  • Chapter 2: Review of Probability
  • Chapter 3: Review of Statistics
  • Chapter 4: Linear Regression with One Regressor
  • Chapter 5: Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals
  • Chapter 6: Linear Regression with Multiple Regressors
  • Chapter 7: Hypothesis Tests and Confidence Intervals in Multiple Regression
  • Chapter 8: Nonlinear Regression Functions
  • Chapter 9: Assessing Studies Based on Multiple Regression
  • Chapter 10: Regression with Panel Data
  • Chapter 11: Regression with a Binary Dependent Variable
  • Chapter 12: Instrumental Variables Regression
  • Chapter 13: Experiments and Quasi-Experiments
  • Chapter 14: Time Series Regression and Forecasting
  • Chapter 15: Analysis of Causal Relationships
  • Chapter 16: Additional Topics in Time Series Regression
  • Chapter 17: The Theory of Linear Regression with One Regressor
  • Chapter 18: The Theory of Multiple Regression

All of the questions are suitable for reviewing econometrics and for building a question bank. They use basic, very simple economic (and econometric) terminology, so readers can easily follow the content and expand their specialist econometrics vocabulary.

Chapter 1: Economic Questions and Data

KTL_001_C1_1: Analyzing the behavior of unemployment rates across U.S. states in March of 2010 is an example of using
○ time series data.
○ panel data.
● cross-sectional data.
○ experimental data.

KTL_001_C1_2: Studying inflation in the United States from 1970 to 2010 is an example of using
○ randomized controlled experiments.
● time series data.
○ panel data.
○ cross-sectional data.

KTL_001_C1_3: Analyzing the effect of minimum wage changes on teenage employment across the 48 contiguous U.S. states from 1980 to 2010 is an example of using
○ time series data.
● panel data.
○ having a treatment group vs. a control group, since only teenagers receive minimum wages.
○ cross-sectional data.

KTL_001_C1_4: Econometrics can be defined as follows with the exception of
○ the science of testing economic theory.
○ fitting mathematical economic models to real-world data.
○ a set of tools used for forecasting future values of economic variables.
● measuring the height of economists.

KTL_001_C1_5: The accompanying graph

is an example of
○ experimental data.
○ cross-sectional data.
● a time series.
○ longitudinal data.

KTL_001_C1_6: One of the primary advantages of using econometrics over typical results from economic theory, is that
● it potentially provides you with quantitative answers for a policy problem rather than simply suggesting the direction (positive/negative) of the response.
○ teaching you how to use statistical packages.
○ learning how to invert a 4 by 4 matrix.
○ all of the above.

KTL_001_C1_7: In a randomized controlled experiment
○ you control for the effect that random numbers are not truly randomly generated
● there is a control group and a treatment group.
○ you control for random answers.
○ the control group receives treatment on even days only.

KTL_001_C1_8: The reason why economists do not use experimental data more frequently is for all of the following reasons except that real-world experiments
○ with humans are difficult to administer.
○ are often unethical.
● cannot be executed in economics.
○ have flaws relative to ideal randomized controlled experiments.

KTL_001_C1_9: The most frequently used experimental or observational data in econometrics are of the following type:
○ randomly generated data.
○ time series data.
○ panel data.
● cross-sectional data.

KTL_001_C1_10: In the graph below, the vertical axis represents average real GDP growth for 65 countries over the period 1960-1995, and the horizontal axis shows the average trade share within these countries.

This is an example of
○ experimental data.
● cross-sectional data.
○ a time series.
○ longitudinal data.

Chapter 2: Review of Probability

KTL_001_C2_1: The expected value of a discrete random variable
○ is the outcome that is most likely to occur.
○ can be found by determining the 50% value in the c.d.f.
○ equals the population median.
● is computed as a weighted average of the possible outcome of that random variable, where the weights are the probabilities of that outcome.
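
The weighted-average definition in KTL_001_C2_1 can be sketched as follows. The outcomes and probabilities are hypothetical, made up for illustration; the example also shows that the expected value need not coincide with the most likely outcome.

```python
# Hypothetical discrete distribution (assumption for illustration only).
outcomes = [0, 1, 2, 3, 4]
probabilities = [0.05, 0.10, 0.25, 0.35, 0.25]

# Expected value: each outcome weighted by its probability.
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))

# Most likely outcome (the mode), for comparison with the expected value.
most_likely = max(zip(probabilities, outcomes))[1]

print(expected_value, most_likely)
```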

KTL_001_C2_2: For a normal distribution, the skewness and kurtosis measures are as follows:
○ 1.96 and 4
○ 0 and 0
● 0 and 3
○ 1 and 2

KTL_001_C2_3: The correlation between X and Y
○ cannot be negative since variances are always positive.
○ is the covariance squared.
● can be calculated by dividing the covariance between X and Y by the product of the two standard deviations.

KTL_001_C2_8: Assume that you assign the following subjective probabilities for your final grade in your econometrics course (the standard GPA scale of 4 = A to 0 = F applies):


The expected value is:
○ 3.0
○ 3.5
● 2.78
○ 3.25

KTL_001_C2_9: The mean and variance of a Bernoulli random variable are given as
○ cannot be calculated
○ np and np(1-p)
○ $p$ and $p \sqrt{1-p}$
● $p$ and $p(1-p)$

KTL_001_C2_10: Consider the following linear transformation of a random variable: $y=\frac{x-\mu_{x}}{\sigma_{x}}$, where $\mu_{x}$ is the mean of $x$ and $\sigma_{x}$ is the standard deviation. Then the expected value and the standard deviation of $Y$ are given as
● 0 and 1
○ 1 and 1
○ Cannot be computed because $Y$ is not a linear function of $X$
○ $\frac{\mu_{x}}{\sigma_{x}}$ and $\sigma_{x}$
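
The standardization in KTL_001_C2_10 can be checked numerically. This is a minimal sketch on a hypothetical data set, treating the list as the whole population (hence `pstdev`); subtracting the mean and dividing by the standard deviation gives a variable with mean 0 and standard deviation 1, up to floating-point error.

```python
from statistics import mean, pstdev

# Hypothetical population of values (assumption for illustration only).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

mu = mean(x)       # population mean
sigma = pstdev(x)  # population standard deviation

# Standardize: y = (x - mu) / sigma
y = [(xi - mu) / sigma for xi in x]

print(mean(y), pstdev(y))
```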

Chapter 3: Review of Statistics

KTL_001_C3_1: An estimator $\hat{\mu}_Y$ of the population value $\mu_Y$ is consistent if
● $\hat{\mu}_Y \xrightarrow{p} \mu_Y$.
○ its mean square error is the smallest possible.
○ Y is normally distributed.
○ $\bar{Y} \xrightarrow{p} 0$.

KTL_001_C3_2: A type II error is
○ typically smaller than the type I error.
○ the error you make when choosing type II or type I.
● the error you make when not rejecting the null hypothesis when it is false.
○ cannot be calculated when the alternative hypothesis contains an “=”.

KTL_001_C3_3: A large p-value implies
○ rejection of the null hypothesis.
○ a large t-statistic.
○ a large $\bar{Y}^{act}$.
● that the observed value $\bar{Y}^{act}$ is consistent with the null hypothesis.

KTL_001_C3_4: The power of the test
○ is the probability that the test actually incorrectly rejects the null hypothesis when the null is true.
○ depends on whether you use $\bar{Y}$ or $\bar{Y}^2$ for the t-statistic.
○ is one minus the size of the test.
● is the probability that the test correctly rejects the null when the alternative is true.

KTL_001_C3_5: The following statement about the sample correlation coefficient is true.
● $-1 \leq r_{XY} \leq 1$
○ $r_{XY} \rightarrow \operatorname{corr}\left(X_{i}, Y_{i}\right)$
○ $\left|r_{XY}\right| \leq 1$
○ $r_{XY}=\frac{S_{XY}^{2}}{S_{X}^{2} S_{Y}^{2}}$

KTL_001_C3_6: When testing for differences of means, the t-statistic $t=\frac{\bar{Y}_{m}-\bar{Y}_{w}}{SE\left(\bar{Y}_{m}-\bar{Y}_{w}\right)}$, where $SE\left(\bar{Y}_{m}-\bar{Y}_{w}\right)=\sqrt{\frac{S_{m}^{2}}{n_{m}}+\frac{S_{w}^{2}}{n_{w}}}$, has
○ a Student t distribution if the population distribution of Y is not normal
○ a Student t distribution if the population distribution of Y is normal
○ a normal distribution even in small samples
● cannot be computed unless $n_m = n_w$

KTL_001_C3_7: When testing for differences of means, you can base statistical inference on the
○ Student t distribution in general
○ Normal distribution regardless of sample size
● Student t distribution if the underlying population distribution of Y is normal, the two groups have the same variances, and you use the pooled standard error formula
○ Chi-squared distribution with $(n_m + n_w - 2)$ degrees of freedom

KTL_001_C3_8: Assume that you have 125 observations on the height (H) and weight (W) of your peers in college. Let the sample covariance be $s_{HW}$ = 68 and the sample standard deviations be $s_H$ = 3.5 and $s_W$ = 29. The sample correlation coefficient is
○ 1.22
○ 0.50
● 0.67
○ Cannot be computed since males and females have not been separated out.

KTL_001_C3_9: You have collected data on the average weekly amount of studying time (T) and grades (G) from the peers at your college. Changing the measurement from minutes into hours has the following effect on the correlation coefficient:
○ decreases the r(TG) by dividing the original correlation coefficient by 60
○ results in a higher r(TG)
○ cannot be computed since some students study less than an hour per week
● does not change the r(TG)
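
The unit invariance in KTL_001_C3_9 can be checked with a short sketch. The study times and grades below are hypothetical, and `correlation` is a hand-written helper (an assumption for illustration) that divides the sample covariance by the product of the two sample standard deviations, as in KTL_001_C2_3.

```python
from statistics import mean

def correlation(a, b):
    """Sample correlation: covariance divided by the product of standard deviations."""
    n = len(a)
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    sd_a = (sum((x - mean_a) ** 2 for x in a) / (n - 1)) ** 0.5
    sd_b = (sum((y - mean_b) ** 2 for y in b) / (n - 1)) ** 0.5
    return cov / (sd_a * sd_b)

# Hypothetical data: weekly study time in minutes and grades.
minutes = [30, 90, 120, 240, 300]
grades = [2.0, 2.5, 3.0, 3.2, 3.8]

# Same study times expressed in hours (one student studies under an hour).
hours = [m / 60 for m in minutes]

r_minutes = correlation(minutes, grades)
r_hours = correlation(hours, grades)
print(r_minutes, r_hours)
```

Rescaling T divides both the covariance and the standard deviation of T by 60, so the ratio is unchanged.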

KTL_001_C3_10: A low correlation coefficient implies that
○ the line always has a flat slope
● in the scatterplot, the points fall quite far away from the line
○ the two variables are unrelated
○ you should use a tighter scale of the vertical and horizontal axis to bring the observations closer to the line

Chapter 4: Linear Regression with One Regressor

KTL_001_C4_1: Binary variables
○ are generally used to control for outliers in your sample.
○ can take on more than two values.
○ exclude certain individuals from your sample.
● can take on only two values.

KTL_001_C4_2: In the simple linear regression model, the regression slope
○ indicates by how many percent Y increases, given a one percent increase in X.
○ when multiplied with the explanatory variable will give you the predicted Y.
● indicates by how many units Y increases, given a one unit increase in X.
○ represents the elasticity of Y on X.

KTL_001_C4_3: The regression $R^2$ is a measure of
○ whether or not X causes Y.
● the goodness of fit of your regression line.
○ whether or not ESS > TSS.
○ the square of the determinant of R.

KTL_001_C4_4: In the simple linear regression model Yi=β0+β1Xi+ui,

○ the intercept is typically small and unimportant.
● β0+β1Xi represents the population regression function.
○ the absolute value of the slope is typically between 0 and 1.
○ β0+β1Xi represents the sample regression function.

KTL_001_C4_5: $E(u_i \mid X_i) = 0$ says that
○ dividing the error by the explanatory variable results in a zero (on average).
○ the sample regression function residuals are unrelated to the explanatory variable.
○ the sample mean of the Xs is much larger than the sample mean of the errors.
● the conditional distribution of the error given the explanatory variable has a zero mean.

KTL_001_C4_6: Assume that you have collected a sample of observations from over 100 households and their consumption and income patterns. Using these observations, you estimate the following regression $C_i=\beta_0+\beta_1 Y_i+u_i$, where C is consumption and Y is disposable income. The estimate of $\beta_1$ will tell you
○ $\frac{\Delta \text{income}}{\Delta \text{consumption}}$
○ The amount you need to consume to survive
○ Consumption / Income
● $\frac{\Delta \text{consumption}}{\Delta \text{income}}$

KTL_001_C4_7: In which of the following relationships does the intercept have a real-world interpretation?
● the relationship between the change in the unemployment rate and the growth rate of real GDP (“Okun’s Law”)
○ the demand for coffee and its price
○ test scores and class-size
○ weight and height of individuals

KTL_001_C4_8: The OLS residuals, $\hat{u}_i$, are sample counterparts of the population
○ regression function slope
● errors
○ regression function’s predicted values
○ regression function intercept

KTL_001_C4_9: Changing the units of measurement, e.g. measuring test scores in 100s, will do all of the following EXCEPT for changing the
○ residuals
○ numerical value of the slope estimate
● interpretation of the effect that a change in X has on the change in Y
○ numerical value of the intercept

KTL_001_C4_10: To decide whether the slope coefficient indicates a “large” effect of X on Y, you look at the
○ size of the slope coefficient
○ regression $R^2$
● economic importance implied by the slope coefficient
○ value of the intercept

Chapter 5: Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

KTL_001_C5_1: The t-statistic is calculated by dividing
○ the OLS estimator by its standard error.
○ the slope by the standard deviation of the explanatory variable.
● the estimator minus its hypothesized value by the standard error of the estimator.
○ the slope by 1.96.

KTL_001_C5_2: Imagine that you were told that the t-statistic for the slope coefficient of the regression line $\hat{Y}$ = 698.9 – 2.28*STR was 4.38. What are the units of measurement for the t-statistic?
○ points of the test score.
○ number of students per teacher.
○ TestScore / STR
● standard deviations

KTL_001_C5_3: The 95% confidence interval for $\beta_1$ is the interval
○ $[\beta_{1}-1.96\, SE(\beta_{1}),\ \beta_{1}+1.96\, SE(\beta_{1})]$
○ $[\hat{\beta}_{1}-1.645\, SE(\hat{\beta}_{1}),\ \hat{\beta}_{1}+1.645\, SE(\hat{\beta}_{1})]$
● $[\hat{\beta}_{1}-1.96\, SE(\hat{\beta}_{1}),\ \hat{\beta}_{1}+1.96\, SE(\hat{\beta}_{1})]$
○ $[\hat{\beta}_{1}-1.96,\ \hat{\beta}_{1}+1.96]$

KTL_001_C5_4: A binary variable is often called a
● dummy variable
○ dependent variable
○ residual
○ power of a test

KTL_001_C5_5: If the errors are heteroskedastic, then
● WLS is BLUE if the conditional variance of the errors is known up to a constant factor of proportionality
○ LAD is BLUE if the conditional variance of the errors is known up to a constant factor of proportionality
○ OLS is efficient

KTL_001_C5_6: Using the textbook example of 420 California school districts and the regression of test scores on the student-teacher ratio, you find that the standard error on the slope coefficient is 0.51 when using the heteroskedasticity robust formula, while it is 0.48 when employing the homoskedasticity only formula. When calculating the t-statistic, the recommended procedure is to
○ use the homoskedasticity only formula because the t-statistic becomes larger
○ first test for homoskedasticity of the errors and then make a decision
● use the heteroskedasticity robust formula
○ make a decision depending on how much different the estimate of the slope is under the two procedures

KTL_001_C5_7: Consider the estimated equation from your textbook

$\hat{Y}$ = 698.9 – 2.28*STR
(10.4)     (0.52)
$R^2$ = 0.051, SER = 18.6

The t-statistic for the slope is approximately
● 4.38
○ 67.20
○ 0.52
○ 1.76
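
The calculation behind KTL_001_C5_7 follows the definition in KTL_001_C5_1: the estimate minus its hypothesized value, divided by the standard error. A minimal sketch, with `t_statistic` as a hypothetical helper name:

```python
def t_statistic(estimate, hypothesized_value, standard_error):
    """t = (estimate - hypothesized value) / standard error of the estimate."""
    return (estimate - hypothesized_value) / standard_error

# Slope and standard error from the estimated equation; null hypothesis: slope = 0.
t_slope = t_statistic(-2.28, 0.0, 0.52)

# The question reports the statistic in absolute value.
print(round(abs(t_slope), 2))
```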

KTL_001_C5_8: You have collected data for the 50 U.S. states and estimated the following relationship between the change in the unemployment rate from the previous year ($\hat{Y}$) and the growth rate of the respective state real GDP (gdp). The results are as follows

$\hat{Y}$ = 2.81 – 0.23 * gdp,
(0.12)    (0.04)
$R^2$ = 0.36, SER = 0.78

Assuming that the estimator has a normal distribution, the 95% confidence interval for the slope is approximately the interval
○ [2.57, 3.05]
○ [-0.31, 0.15]
● [-0.31, -0.15]
○ [-0.33, -0.13]
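
The interval in KTL_001_C5_8 follows the formula from KTL_001_C5_3, estimate ± 1.96 × standard error. A minimal sketch, assuming the normal approximation stated in the question; `confidence_interval` is a hypothetical helper name:

```python
def confidence_interval(estimate, standard_error, critical_value=1.96):
    """Two-sided confidence interval: estimate +/- critical value * standard error."""
    half_width = critical_value * standard_error
    return estimate - half_width, estimate + half_width

# Slope and standard error from the estimated Okun's Law regression.
lower, upper = confidence_interval(-0.23, 0.04)
print(f"[{lower:.2f}, {upper:.2f}]")
```

Because the interval excludes zero, the slope is statistically significant at the 5% level.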

KTL_001_C5_9: Using 143 observations, assume that you had estimated a simple regression function and that your estimate for the slope was 0.04, with a standard error of 0.01. You want to test whether or not the estimate is statistically significant. Which of the following decisions is the only correct one:
○ you decide that the coefficient is small and hence most likely is zero in the population
● the slope is statistically significant since it is four standard errors away from zero
○ the response of Y given a change in X must be economically important since it is statistically significant
○ since the slope is very small, so must be the regression $R^2$.

KTL_001_C5_10: You extract approximately 5,000 observations from the Current Population Survey (CPS) and estimate the following regression function:

$\hat{Y}$ = 3.32 – 0.45 * Age,
(1.00)   (0.04)
$R^2$ = 0.02, SER = 8.66

where Y is average hourly earnings, and Age is the individual’s age. Given the specification, your 95% confidence interval for the effect of changing age by 5 years is approximately
● [$1.96, $2.54]
○ [$2.32, $4.32]
○ [$1.35, $5.30]
○ cannot be determined given the information provided.

Chapter 6: Linear Regression with Multiple Regressors

KTL_001_C6_1: In the multiple regression model, the adjusted $R^2$, or $\bar{R}^2$,
○ cannot be negative.
● will never be greater than the regression $R^2$.
○ equals the square of the correlation coefficient r.
○ cannot decrease when an additional explanatory variable is added.

KTL_001_C6_2: If you had a two-regressor regression model, then omitting one variable which is relevant
○ will have no effect on the coefficient of the included variable if the correlation between the excluded and the included variable is negative.
○ will always bias the coefficient of the included variable upwards.
● can result in a negative value for the coefficient of the included variable, even though the coefficient will have a significant positive effect on Y if the omitted variable were included.
○ makes the sum of the product between the included variable and the residuals different from 0.

KTL_001_C6_3: Under the least squares assumptions for the multiple regression problem (zero conditional mean for the error term, all $X_i$ and $Y_i$ being i.i.d., all $X_i$ and $u_i$ having finite fourth moments, no perfect multicollinearity), the OLS estimators for the slopes and intercept
○ have an exact normal distribution for n > 25.
○ are BLUE.
○ have a normal distribution in small samples as long as the errors are homoskedastic.
● are unbiased and consistent.

KTL_001_C6_4: The following OLS assumption is most likely violated by omitted variables bias:
● $E\left(u_{i} \mid X_{i}\right)=0$
○ $(X_{i}, Y_{i})$, $i=1, \ldots, n$, are i.i.d. draws from their joint distribution
○ there are no outliers for $X_{i}, u_{i}$
○ there is heteroskedasticity

KTL_001_C6_5: The adjusted $R^{2}$, or $\bar{R}^{2}$, is given by
○ $1-\frac{n-2}{n-k-1} \frac{SSR}{TSS}$
○ $1-\frac{n-1}{n-k-1} \frac{ESS}{TSS}$
● $1-\frac{n-1}{n-k-1} \frac{SSR}{TSS}$
○ $\frac{ESS}{TSS}$

KTL_001_C6_6: Consider the multiple regression model with two regressors $X_{1}, X_{2}$, where both variables are determinants of the dependent variable. When omitting $X_{2}$ from the regression, there will be omitted variable bias for $\hat{\beta}_{1}$
● if $X_{1}, X_{2}$ are correlated
○ always
○ if $X_{2}$ is measured in percentages
○ only if $X_{2}$ is a dummy variable
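
The adjusted $R^2$ formula in KTL_001_C6_5 can be sketched as below. The sums of squares, sample size, and number of regressors are hypothetical values picked for illustration; they show two properties raised in KTL_001_C6_1, namely that $\bar{R}^2$ never exceeds $R^2$ and that it can be negative.

```python
def r_squared(ssr, tss):
    """R^2 = 1 - SSR/TSS."""
    return 1.0 - ssr / tss

def adjusted_r_squared(ssr, tss, n, k):
    """Adjusted R^2 = 1 - (n - 1)/(n - k - 1) * SSR/TSS, with n observations and k regressors."""
    return 1.0 - (n - 1) / (n - k - 1) * ssr / tss

# Hypothetical values (assumptions for illustration only): a poorly fitting regression.
ssr, tss, n, k = 90.0, 100.0, 20, 3

r2 = r_squared(ssr, tss)
r2_adj = adjusted_r_squared(ssr, tss, n, k)
print(r2, r2_adj)
```

The factor (n − 1)/(n − k − 1) is greater than 1 whenever k ≥ 1, which is why the adjustment always lowers the measure of fit.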

KTL_001_C6_7: The dummy variable trap is an example of
○ imperfect multicollinearity
○ something that is of theoretical interest only
● perfect multicollinearity
○ something that does not happen to university or college students
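
The dummy variable trap in KTL_001_C6_7 can be seen in a small sketch with hypothetical data: when a regression includes an intercept together with a dummy for every category, the intercept column equals the sum of the dummy columns exactly, which is perfect multicollinearity.

```python
# Hypothetical sample of five individuals (assumption for illustration only).
female = [1, 0, 1, 1, 0]
male = [1 - f for f in female]   # complementary dummy
intercept = [1] * len(female)    # column of ones for the constant term

# The intercept is an exact linear combination of the two dummies.
is_perfectly_collinear = all(
    c == f + m for c, f, m in zip(intercept, female, male)
)
print(is_perfectly_collinear)
```

Dropping either one of the dummies or the intercept removes the exact linear dependence.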

KTL_001_C6_8: Imperfect multicollinearity
○ is not relevant to the field of economics and business administration
○ only occurs in the study of finance
○ means that the least squares estimator of the slope is biased
● means that two or more of the regressors are highly correlated

KTL_001_C6_9: Consider the multiple regression model with two regressors $X_{1}, X_{2}$, where both variables are determinants of the dependent variable. You first regress $Y$ on $X_{1}$ only and find no relationship. However, when regressing $Y$ on $X_{1}, X_{2}$, the slope coefficient $\hat{\beta}_{1}$ changes by a large amount. This suggests that your first regression suffers from
○ heteroskedasticity
○ perfect multicollinearity
● omitted variable bias
○ dummy variable trap

KTL_001_C6_10: Imperfect multicollinearity
● implies that it will be difficult to estimate precisely one or more of the partial effects using the data at hand
○ violates one of the four Least Squares assumptions in the multiple regression model
○ means that you cannot estimate the effect of at least one of the Xs on Y
○ suggests that a standard spreadsheet program does not have enough power to estimate the multiple regression model

Chapter 7: Hypothesis Tests and Confidence Intervals in Multiple Regression

KTL_001_C7_1: When testing joint hypotheses, you should
○ use t-statistics for each hypothesis and reject the null hypothesis if all of the restrictions fail
○ use the F-statistic and reject all the hypotheses if the statistic exceeds the critical value
○ use t-statistics for each hypothesis and reject the null hypothesis once the statistic exceeds the critical value for a single hypothesis
● use the F-statistic and reject at least one of the hypotheses if the statistic exceeds the critical value

KTL_001_C7_2: In the multiple regression model, the t-statistic for testing that the slope is significantly different from zero is calculated
● by dividing the estimate by its standard error.
○ from the square root of the F-statistic.
○ by multiplying the p-value by 1.96.
○ using the adjusted $R^2$ and the confidence interval.

KTL_001_C7_3: If you wanted to test, using a 5% significance level, whether or not a specific slope coefficient is equal to one, then you should
● subtract 1 from the estimated coefficient, divide the difference by the standard error, and check if the resulting ratio is larger than 1.96.
○ add and subtract 1.96 from the slope and check if that interval includes 1.
○ see if the slope coefficient is between 0.95 and 1.05.
○ check if the adjusted $R^2$ is close to 1.

KTL_001_C7_4: When there are two coefficients, the resulting confidence sets are
○ rectangles
● ellipses
○ squares
○ trapezoids

KTL_001_C7_5: All of the following are true, with the exception of one condition:
○ a high $R^{2}$ or $\bar{R}^{2}$ does not mean that the regressors are a true cause of the dependent variable
○ a high $R^{2}$ or $\bar{R}^{2}$ does not mean that there is no omitted variable bias
● a high $R^{2}$ or $\bar{R}^{2}$ always means that an added variable is statistically significant
○ a high $R^{2}$ or $\bar{R}^{2}$ does not necessarily mean that you have the most appropriate set of regressors

KTL_001_C7_6: You have estimated the relationship between test scores and the student-teacher ratio under the assumption of homoskedasticity of the error terms. The regression output is as follows: $\hat{Y}=698.9-2.28 \times STR$, and the standard error on the slope is 0.48. The homoskedasticity-only "overall" regression F-statistic for the hypothesis that the regression $R^{2}$ is zero is approximately

○ 0.96
○ 1.96
● 22.56
○ 4.75
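
For KTL_001_C7_6, the regression has a single regressor, so the homoskedasticity-only overall F-statistic tests one restriction (slope = 0) and equals the square of the homoskedasticity-only t-statistic on the slope. A minimal sketch of that arithmetic:

```python
# Slope estimate and homoskedasticity-only standard error from the question.
slope = -2.28
standard_error = 0.48

# t-statistic for the null hypothesis that the slope is zero.
t_stat = slope / standard_error

# With one restriction, the homoskedasticity-only F-statistic is t squared.
f_stat = t_stat ** 2
print(round(f_stat, 2))
```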

KTL_001_C7_7: Consider a regression with two variables, in which $X_{1i}$ is the variable of interest and $X_{2i}$ is the control variable. Conditional mean independence requires
● $E\left(u_{i} \mid X_{1i}, X_{2i}\right)=E\left(u_{i} \mid X_{2i}\right)$
○ $E\left(u_{i} \mid X_{1i}, X_{2i}\right)=E\left(u_{i} \mid X_{1i}\right)$
○ $E\left(u_{i} \mid X_{1i}\right)=E\left(u_{i} \mid X_{2i}\right)$
○ $E\left(u_{i}\right)=E\left(u_{i} \mid X_{2i}\right)$

KTL_001_C7_8: The homoskedasticity-only F-statistic and the heteroskedasticity-robust F-statistic typically are
○ the same
● different
○ related by a linear function
○ a multiple of each other (the heteroskedasticity-robust F-statistic is 1.96 times the homoskedasticity-only F-statistic)

KTL_001_C7_9: Consider the following regression output where the dependent variable is test scores and the two explanatory variables are the student-teacher ratio and the percent of English learners: $\hat{Y}=698.9-1.10 \times STR-0.650 \times EL$. You are told that the t-statistic on the student-teacher ratio coefficient is 2.56. The standard error therefore is approximately
○ 0.25
○ 1.96
○ 0.650
● 0.43

KTL_001_C7_10: The critical value of $F_{4, \infty}$ at the 5% significance level is

○ 3.84
● 2.37
○ 1.94
○ Cannot be calculated because in practice you will not have infinite number of observations

Chapter 8: Nonlinear Regression Functions

KTL_001_C8_1: The interpretation of the slope coefficient in the model $\ln\left(Y_{i}\right)=\beta_{0}+\beta_{1} \ln\left(X_{i}\right)+u_{i}$ is as follows: a
● 1% change in X is associated with a $\beta_{1}$% change in Y.
○ change in X by one unit is associated with a $\beta_{1}$ change in Y.
○ change in X by one unit is associated with a $100 \beta_{1}$% change in Y.
○ 1% change in X is associated with a change in Y of $0.01 \beta_{1}$.

KTL_001_C8_2: A nonlinear function
○ makes little sense, because variables in the real world are related linearly.
○ can be adequately described by a straight line between the dependent variable and one of the explanatory variables.
○ is a concept that only applies to the case of a single or two explanatory variables since you cannot draw a line in four dimensions.
● is a function with a slope that is not constant.

KTL_001_C8_4: The best way to interpret polynomial regressions is to
○ take a derivative of Y with respect to the relevant X.
● plot the estimated regression function and to calculate the estimated effect on Y associated with a change in X for one or more values of X.
○ look at the t-statistics for the relevant coefficients.
○ analyze the standard error of estimated effect.

KTL_001_C8_5: In the log-log model, the slope coefficient indicates
○ the effect that a unit change in X has on Y.
● the elasticity of Y with respect to X.
○ $\frac{\Delta Y}{\Delta X}$.
○ $\frac{\Delta Y}{\Delta X} \times \frac{Y}{X}$.

KTL_001_C8_6: In the model $\ln\left(Y_{i}\right)=\beta_{0}+\beta_{1} X_{i}+u_{i}$, the elasticity of $E(Y \mid X)$ with respect to $X$ is
● $\beta_{1} X$
○ $\beta_{1}$
○ $\frac{\beta_{1} X}{\beta_{0}+\beta_{1} X}$
○ cannot be calculated because the function is non-linear

KTL_001_C8_7: Assume that you had estimated the following quadratic regression model: $\hat{Y}=607.3+3.85\, Income-0.0423\, Income^{2}$. If income increased from 10 to 11 (\$10,000 to \$11,000), then the predicted effect on test scores would be
○ 3.85
○ 3.85-0.0423
○ Cannot be calculated because the function is non-linear
● 2.96
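
The answer to KTL_001_C8_7 comes from evaluating the estimated quadratic at both income levels and taking the difference, as recommended in KTL_001_C8_4. A minimal sketch; `predicted_score` is a hypothetical helper name:

```python
def predicted_score(income):
    """Predicted test score from the estimated quadratic regression (income in $1,000s)."""
    return 607.3 + 3.85 * income - 0.0423 * income ** 2

# Effect of raising income from 10 to 11: difference of the two predictions.
effect = predicted_score(11) - predicted_score(10)
print(round(effect, 2))
```

The same change starting from a higher income level would give a smaller effect, because the slope of a quadratic is not constant.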

KTL_001_C8_8: Consider the polynomial regression model of degree r, $Y_{i}=\beta_{0}+\beta_{1} X_{i}+\beta_{2} X_{i}^{2}+\ldots+\beta_{r} X_{i}^{r}+u_{i}$. The null hypothesis that the regression is linear, against the alternative that it is a polynomial of degree r, corresponds to
○ $H_{0}: \beta_{r}=0$ vs. $H_{1}: \beta_{r} \neq 0$
○ $H_{0}: \beta_{1}=0$ vs. $H_{1}: \beta_{1} \neq 0$
○ $H_{0}: \beta_{2}=0, \beta_{3}=0, \ldots, \beta_{r}=0$ vs. $H_{1}$: all $\beta_{i} \neq 0$, $i=2, \ldots, r$
● $H_{0}: \beta_{2}=0, \beta_{3}=0, \ldots, \beta_{r}=0$ vs. $H_{1}$: at least one $\beta_{i} \neq 0$, $i=2, \ldots, r$

KTL_001_C8_9: Consider the following least squares specification between test scores and the student-teacher ratio: $\hat{Y}=557.8+36.42 \ln(Income)$. According to this equation, a 1% increase in income is associated with an increase in test scores of
● 0.36 points
○ 36.42 points
○ 557.8 points
○ cannot be determined from the information given here
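
For the linear-log specification in KTL_001_C8_9, a 1% increase in income changes the prediction by about $0.01 \times 36.42$ points. The sketch below compares that approximation with the exact difference in predictions; the starting income of 20 is a hypothetical value, and the result does not depend on it.

```python
import math

def predicted_score(income):
    """Predicted test score from the estimated linear-log regression."""
    return 557.8 + 36.42 * math.log(income)

base_income = 20.0  # hypothetical starting income (assumption for illustration only)

# Exact change in the prediction after a 1% increase in income.
exact_change = predicted_score(base_income * 1.01) - predicted_score(base_income)

# Rule-of-thumb approximation: 0.01 times the slope coefficient.
approx_change = 0.01 * 36.42

print(round(exact_change, 2), round(approx_change, 2))
```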

KTL_001_C8_10: Consider the population regression of log earnings [$Y_i$, where $Y_i = \ln(Earnings_i)$] against two binary variables: whether a worker is married ($D_{1i}$, where $D_{1i}=1$ if the ith person is married) and the worker’s gender ($D_{2i}$, where $D_{2i}=1$ if the ith person is female), and the product of the two binary variables:

$Y_{i}=\beta_{0}+\beta_{1} D_{1i}+\beta_{2} D_{2i}+\beta_{3}\left(D_{1i} \times D_{2i}\right)+u_{i}$.

The interaction term

● allows the population effect on log earnings of being married to depend on gender
○ does not make sense since it could be zero for married males
○ indicates the effect of being married on log earnings
○ cannot be estimated without the presence of a continuous variable
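
The role of the interaction term in KTL_001_C8_10 can be made concrete with a short sketch. The coefficient values are hypothetical, chosen only for illustration: the effect of being married is $\beta_1$ for men and $\beta_1 + \beta_3$ for women, so it depends on gender whenever $\beta_3 \neq 0$.

```python
def expected_log_earnings(married, female, b0, b1, b2, b3):
    """Population regression with two binary variables and their interaction."""
    return b0 + b1 * married + b2 * female + b3 * married * female

# Hypothetical coefficient values (assumptions for illustration only).
coef = dict(b0=2.5, b1=0.10, b2=-0.20, b3=-0.05)

# Effect of being married for men (female = 0): equals b1.
effect_men = expected_log_earnings(1, 0, **coef) - expected_log_earnings(0, 0, **coef)

# Effect of being married for women (female = 1): equals b1 + b3.
effect_women = expected_log_earnings(1, 1, **coef) - expected_log_earnings(0, 1, **coef)

print(effect_men, effect_women)
```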

Chapter 9: Assessing Studies Based on Multiple Regression

KTL_001_C9_1: A survey of earnings contains an unusually high fraction of individuals who state their weekly earnings in 100s, such as 300, 400, 500, etc. This is an example of
● errors-in-variables bias.
○ sample selection bias.
○ simultaneous causality bias.
○ companies that typically bargain with workers in 100s of dollars.

KTL_001_C9_2: In the case of a simple regression, where the independent variable is measured with i.i.d. error,

● $\hat{\beta}_{1} \xrightarrow{p} \frac{\sigma_{x}^{2}}{\sigma_{x}^{2}+\sigma_{w}^{2}} \beta_{1}$
○ $\hat{\beta}_{1} \xrightarrow{p} \frac{\sigma_{x}^{2}}{\sigma_{x}^{2}+\sigma_{w}^{2}}$
○ $\hat{\beta}_{1} \xrightarrow{p} \frac{\sigma_{w}^{2}}{\sigma_{x}^{2}+\sigma_{w}^{2}} \beta_{1}$
○ $\hat{\beta}_{1} \xrightarrow{p} \beta_{1}+\frac{\sigma_{x}^{2}}{\sigma_{x}^{2}+\sigma_{w}^{2}}$
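
The probability limit in KTL_001_C9_2 can be sketched numerically. This assumes classical measurement error (i.i.d., uncorrelated with the true regressor and with the regression error); the variances and the true slope below are hypothetical. The OLS slope converges to the true slope multiplied by an attenuation factor between 0 and 1.

```python
def attenuation_factor(var_x, var_w):
    """Ratio var(X) / (var(X) + var(w)), where w is the measurement error."""
    return var_x / (var_x + var_w)

# Hypothetical values (assumptions for illustration only).
true_slope = 2.0
var_x = 4.0  # variance of the true regressor
var_w = 1.0  # variance of the measurement error

factor = attenuation_factor(var_x, var_w)
plim_ols_slope = factor * true_slope  # what OLS converges to in large samples

print(factor, plim_ols_slope)
```

The factor approaches 1 as the measurement error variance becomes small relative to the variance of the true regressor, so the bias shrinks but does not vanish.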

KTL_001_C9_3: In the case of errors-in-variables bias,
○ maximum likelihood estimation must be used.
● the OLS estimator is consistent if the variance in the unobservable variable is relatively large compared to variance in the measurement error.
○ the OLS estimator is consistent, but no longer unbiased in small samples.
○ binary variables should not be used as independent variables.

KTL_001_C9_4: Comparing the California test scores to test scores in Massachusetts is appropriate for external validity if
○ Massachusetts also allowed beach walking to be an appropriate P.E. activity.
○ the two income distributions were very similar.
○ the student-to-teacher ratio did not differ by more than five on average.
● the institutional settings in California and Massachusetts, such as organization in classroom instruction and curriculum, were similar in the two states.

KTL_001_C9_5: In the case of errors-in-variables bias, the precise size and direction of the bias depend on
○ the sample size in general.
● the correlation between the measured variable and the measurement error.
○ the size of the regression $R^2$.
○ whether the good in question is price elastic.

KTL_001_C9_6: The question of reliability/unreliability of a multiple regression depends on
○ internal but not external validity
○ the quality of your statistical software package
● internal and external validity
○ external but not internal validity

KTL_001_C9_7: A statistical analysis is internally valid if
○ all t-statistics are greater than |1.96|
○ the regression $R^2$ > 0.05
○ the population is small, say less than 2,000, and can be observed
● the statistical inferences about causal effects are valid for the population studied

KTL_001_C9_8: Internal validity is that
● the estimator of the causal effect should be unbiased and consistent
○ the estimator of the causal effect should be efficient
○ inferences and conclusions can be generalized from the population to other populations
○ OLS estimation has been used in your statistical package

KTL_001_C9_9: Threats to internal validity lead to
○ perfect multicollinearity
○ the inability to transfer data sets into your statistical package
● failures of one or more of the least squares assumptions
○ a false generalization to the population of interest

KTL_001_C9_10: The true causal effect might not be the same in the population studied and the population of interest because
○ of differences in characteristics of the population
○ of geographical differences
○ the study is out of date
● all of the above

Chapter 10: Regression with Panel Data

KTL_001_C10_1: The Fixed Effects regression model
● has n different intercepts.
○ the slope coefficients are allowed to differ across entities, but the intercept is “fixed” (remains unchanged).
○ has “fixed” (repaired) the effect of heteroskedasticity.
○ in a log-log model may include logs of the binary variables, which control for the fixed effects.
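
Explanation (illustrative sketch): the n different intercepts absorb entity-level differences, which is equivalent to demeaning each entity's data before running OLS. The tiny panel below is hypothetical and noise-free (assumed true slope 2, entity intercepts 10 and 0), chosen only so that pooled OLS and the entity-demeaned ("within") regression can be compared by hand.

```python
# Hypothetical two-entity panel: Y_it = alpha_i + 2 * X_it, with no noise.
panel = {
    "A": {"x": [1.0, 2.0], "y": [12.0, 14.0]},  # assumed alpha_A = 10
    "B": {"x": [3.0, 4.0], "y": [6.0, 8.0]},    # assumed alpha_B = 0
}


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean(values):
    """Subtract the entity-specific mean from each observation."""
    m = sum(values) / len(values)
    return [v - m for v in values]


# Pooled OLS ignores the entity intercepts.
pooled_x = [v for e in panel.values() for v in e["x"]]
pooled_y = [v for e in panel.values() for v in e["y"]]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Within (fixed effects) estimator: demean by entity, then run OLS.
within_x = [v for e in panel.values() for v in demean(e["x"])]
within_y = [v for e in panel.values() for v in demean(e["y"])]
within_slope = ols_slope(within_x, within_y)

print("pooled slope:", pooled_slope)
print("within slope:", within_slope)
```

In this constructed example the entity intercepts are negatively related to X, so the pooled slope even has the wrong sign, while the within estimator recovers the assumed slope.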

KTL_001_C10_2: In the Fixed Time Effects regression model, you should exclude one of the binary variables for the time periods when an intercept is present in the equation
○ because the first time period must always be excluded from your data set.
○ because there are already too many coefficients to estimate.
● to avoid perfect multicollinearity.
○ to allow for some changes between time periods to take place.

KTL_001_C10_3: When you add state fixed effects to a simple regression model for U.S. states over a certain time period, and the regression R2 increases significantly, then it is safe to assume that

○ the included explanatory variables, other than the state fixed effects, are unimportant.
● state fixed effects account for a large amount of the variation in the data.
○ the coefficients on the other included explanatory variables will not change.
○ time fixed effects are unimportant.

KTL_001_C10_4: In the panel regression analysis of beer taxes on traffic deaths, the estimation period is 1982-1988 for the 48 contiguous U.S. states. To test for the significance of entity fixed effects, you should calculate the F-statistic and compare it to the critical value from your Fq,∞ distribution, where q equals

○ 48.
○ 54.
○ 7.
● 47.

KTL_001_C10_5: In the panel regression analysis of beer taxes on traffic deaths, the estimation period is 1982-1988 for the 48 contiguous U.S. states. To test for the significance of time fixed effects, you should calculate the F-statistic and compare it to the critical value from your Fq,∞ distribution, which equals (at the 5% level)

○ 2.01.
● 2.10.
○ 2.80.
○ 2.64.
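
Explanation (illustrative sketch): with 48 states and 7 years (1982-1988), testing entity fixed effects involves q = 48 - 1 = 47 restrictions and testing time fixed effects involves q = 7 - 1 = 6. Since an F(q, ∞) variable is a chi-squared(q) variable divided by q, the 5% critical value for q = 6 can be checked numerically. The helper names below are hypothetical, and the closed-form chi-squared CDF used here only holds for an even number of degrees of freedom.

```python
import math


def chi2_cdf_even_df(x, df):
    """CDF of a chi-squared variable; closed form valid for even df only."""
    m = df // 2
    half = x / 2.0
    tail = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(m))
    return 1.0 - tail


def f_q_inf_critical(q, level=0.05):
    """Critical value of F(q, infinity) = chi2(q) / q, found by bisection."""
    if q % 2 != 0:
        raise ValueError("this sketch only handles even q")
    lo, hi = 0.0, 200.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even_df(mid, q) < 1.0 - level:
            lo = mid
        else:
            hi = mid
    return hi / q


n_states = 48
years = range(1982, 1989)          # 1982 through 1988, i.e. 7 periods

q_entity = n_states - 1            # restrictions for entity fixed effects
q_time = len(years) - 1            # restrictions for time fixed effects
crit_time = f_q_inf_critical(q_time)

print(q_entity, q_time, round(crit_time, 2))
```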

KTL_001_C10_6: Assume that for the T = 2 time periods case, you have estimated a simple regression in changes model and found a statistically significant positive intercept. This implies
○ a negative mean change in the LHS variable in the absence of a change in the RHS variable since you subtract the earlier period from the later period
○ that the panel estimation approach is flawed since differencing the data eliminates the constant (intercept) in a regression
● a positive mean change in the LHS variable in the absence of a change in the RHS variable
○ that the RHS variable changed between the two subperiods

KTL_001_C10_7: HAC standard errors and clustered standard errors are related as follows:
○ they are the same
● clustered standard errors are one type of HAC standard error
○ they are the same if the data is differenced
○ clustered standard errors are the square root of HAC standard errors

KTL_001_C10_8: In panel data, the regression error
● is likely to be correlated over time within an entity
○ should be calculated taking into account heteroskedasticity but not autocorrelation
○ only exists for the case of T > 2
○ fits all of the three descriptions above

KTL_001_C10_9: It is advisable to use clustered standard errors in panel regressions because
○ without clustered standard errors, the OLS estimator is biased
○ hypothesis testing can proceed in a standard way even if there are few entities (n is small)
○ they are easier to calculate than homoskedasticity-only standard errors
● the fixed effects estimator is asymptotically normally distributed when n is large

KTL_001_C10_10: If Xit is correlated with Xis for different values of s and t, then
● Xit is said to be autocorrelated
○ the OLS estimator cannot be computed
○ statistical inference cannot proceed in a standard way even if clustered standard errors are used
○ this is not of practical importance since these correlations are typically weak in applications

Chapter 11: Regression with a Binary Dependent Variable

KTL_001_C11_1: The linear probability model is
○ the application of the multiple regression model with a continuous left-hand side variable and a binary variable as at least one of the regressors.
○ an example of probit estimation.
○ another word for logit estimation.
● the application of the linear multiple regression model to a binary dependent variable.

KTL_001_C11_2: The probit model
○ is the same as the logit model.
○ always gives the same fit for the predicted values as the linear probability model for values between 0.1 and 0.9.
● forces the predicted values to lie between 0 and 1.
○ should not be used since it is too complicated.

KTL_001_C11_3: In the expression Pr(deny = 1| P/I Ratio, black) = F(–2.26 + 2.74P/I ratio + 0.71black), the effect of increasing the P/I ratio from 0.3 to 0.4 for a white person
○ is 0.274 percentage points.
● is 6.1 percentage points.
○ should not be interpreted without knowledge of the regression R2.
○ is 2.74 percentage points.

KTL_001_C11_4: Nonlinear least squares
● solves the minimization of the sum of squared predictive mistakes through sophisticated mathematical routines, essentially by trial and error methods.
○ should always be used when you have nonlinear equations.
○ gives you the same results as maximum likelihood estimation.
○ is another name for sophisticated least squares.

KTL_001_C11_5: To measure the fit of the probit model, you should:
○ use the regression R2.
○ plot the predicted values and see how closely they match the actuals.
○ use the log of the likelihood function and compare it to the value of the likelihood function.
● use the fraction correctly predicted or the pseudo R2.

KTL_001_C11_6: In the probit regression, the coefficient β1 indicates
○ the change in the probability of Y = 1 given a unit change in X
○ the change in the probability of Y = 1 given a percent change in X
● the change in the z-value associated with a unit change in X
○ none of the above
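
Explanation (illustrative sketch): in a probit model the coefficient shifts the z-value linearly, while the implied change in probability depends on where X starts. The coefficients `b0` and `b1` below are hypothetical and are not taken from any regression in this test bank.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


b0, b1 = -2.0, 3.0  # hypothetical probit coefficients


def z_value(x):
    return b0 + b1 * x


def prob(x):
    """Predicted Pr(Y = 1 | X = x) under the probit model."""
    return std_normal_cdf(z_value(x))


grid = [0.0, 0.5, 1.0]
z_steps = [z_value(b) - z_value(a) for a, b in zip(grid, grid[1:])]
p_steps = [prob(b) - prob(a) for a, b in zip(grid, grid[1:])]

print("change in z per step:", z_steps)            # identical steps
print("change in probability per step:", p_steps)  # steps differ
```

Every predicted probability stays strictly between 0 and 1 because it is a value of the normal CDF.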

KTL_001_C11_7: Your textbook plots the estimated regression function produced by the probit regression of deny on P/I ratio. The estimated probit regression function has a stretched “S” shape given that the coefficient on the P/I ratio is positive. Consider a probit regression function with a negative coefficient. The shape would
● resemble an inverted “S” shape (for low values of X, the predicted probability of Y would approach 1)
○ not exist since probabilities cannot be negative
○ remain the “S” shape as with a positive slope coefficient
○ would have to be estimated with a logit function

KTL_001_C11_8: Probit coefficients are typically estimated using
○ the OLS method
● the method of maximum likelihood
○ non-linear least squares (NLLS)
○ by transforming the estimates from the linear probability model

KTL_001_C11_9: F-statistics computed using maximum likelihood estimators
○ cannot be used to test joint hypotheses
○ are not meaningful since the entire regression R2 concept is hard to apply in this situation
○ do not follow the standard F distribution
● can be used to test joint hypotheses

KTL_001_C11_10: When testing joint hypotheses, you can use
○ the F-statistic
○ the chi-squared statistic
● either the F-statistic or the chi-squared statistic
○ none of the above

Chapter 12: Instrumental Variables Regression

KTL_001_C12_1: The rule-of-thumb for checking for weak instruments is as follows: for the case of a single endogenous regressor,
○ a first stage F must be statistically significant to indicate a strong instrument.
○ a first stage F > 1.96 indicates that the instruments are weak.
○ the t-statistic on each of the instruments must exceed at least 1.64.
● a first stage F < 10 indicates that the instruments are weak.

KTL_001_C12_2: The distinction between endogenous and exogenous variables is
○ that exogenous variables are determined inside the model and endogenous variables are determined outside the model.
○ dependent on the sample size: for n > 100, endogenous variables become exogenous.
○ dependent on the distribution of the variables: when they are normally distributed, they are exogenous, otherwise they are endogenous.
● whether or not the variables are correlated with the error term.

KTL_001_C12_3: The TSLS estimator is
● consistent and has a normal distribution in large samples.
○ unbiased.
○ efficient in small samples.
○ F-distributed.

KTL_001_C12_4: Weak instruments are a problem because
● the TSLS estimator may not be normally distributed, even in large samples.
○ they result in the instruments not being exogenous.
○ the TSLS estimator cannot be computed.
○ you cannot predict the endogenous variables any longer in the first stage.

KTL_001_C12_5: Consider a model with one endogenous regressor and two instruments. Then the J-statistic will be large
○ if the number of observations is very large.
● if the coefficients are very different when estimating the coefficients using one instrument at a time.
○ if the TSLS estimates are very different from the OLS estimates.
○ when you use homoskedasticity-only standard errors.

KTL_001_C12_6: Let W be the included exogenous variables in a regression function that also has endogenous regressors (X). The W variables can
○ be control variables
○ have the property E(ui|Wi)=0
○ make an instrument uncorrelated with u
● all of the above

KTL_001_C12_7: The logic of control variables in IV regressions
● parallels the logic of control variables in OLS
○ only applies in the case of homoskedastic errors in the first stage of two stage least squares estimation
○ is different in a substantial way from the logic of control variables in OLS since there are two stages in estimation
○ implies that the TSLS is efficient

KTL_001_C12_8: For W to be an effective control variable in IV estimation, the following condition must hold

○ $E(u_i) = 0$
● $E(u_i \mid Z_i, W_i) = E(u_i \mid W_i)$
○ $E(u_i u_j) \neq 0$
○ there must be an intercept in the regression

KTL_001_C12_9: The IV estimator can be used to potentially eliminate bias resulting from
○ multicollinearity
○ serial correlation
● errors in variables
○ heteroskedasticity

KTL_001_C12_10: Instrumental Variables regression uses instruments to
○ establish the Mozart Effect
○ increase the regression R2
○ eliminate serial correlation
● isolate movements in X that are uncorrelated with u
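
Explanation (illustrative sketch): with a single instrument Z, the IV estimator is cov(Z, Y) / cov(Z, X), which uses only the variation in X that moves with Z. The four data points below are hypothetical and constructed so that Z is exactly uncorrelated with u in the sample while X = Z + u is endogenous; the assumed true slope is 2.

```python
# Hypothetical data: Z is uncorrelated with u by construction, X = Z + u.
z = [-1.0, -1.0, 1.0, 1.0]
u = [-1.0, 1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
true_beta = 2.0
y = [true_beta * xi + ui for xi, ui in zip(x, u)]


def cov(a, b):
    """Sample covariance (dividing by n, which cancels in the ratios below)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


ols_beta = cov(x, y) / cov(x, x)  # contaminated by cov(X, u) != 0
iv_beta = cov(z, y) / cov(z, x)   # uses only the Z-driven movements in X

print("OLS:", ols_beta)
print("IV: ", iv_beta)
```

In this constructed sample OLS overstates the slope because X and u move together, while the IV ratio returns the assumed value.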

Chapter 13: Experiments and Quasi-Experiments

KTL_001_C13_1: In the context of a controlled experiment, consider the simple linear regression formulation $Y_i=\beta_0+\beta_1 X_i+u_i$. Let $Y_i$ be the outcome, $X_i$ the treatment level when the treatment is binary, and $u_i$ contain all the additional determinants of the outcome. Then calling $\hat{\beta}_1$ a differences estimator
● makes sense since it is the difference between the sample average outcome of the treatment group and the sample average outcome of the control group.
○ and $\hat{\beta}_0$ the level estimator is standard terminology in randomized controlled experiments.
○ does not make sense, since neither Y nor X are in differences.
○ is not quite accurate since it is actually the derivative of Y on X.

KTL_001_C13_2: The following is not a threat to external validity:
○ the experimental sample is not representative of the population of interest.
○ the treatment being studied is not representative of the treatment that would be implemented more broadly.
○ experimental participants are volunteers.
● partial compliance with the treatment protocol.

KTL_001_C13_3: Experimental data are often
○ observational data.
○ binary data, in that the subject either does or does not respond to the treatment.
● panel data.
○ time series data.

KTL_001_C13_4: The following estimation methods should not be used to test for randomization when Xi is binary:
● linear probability model (OLS) with homoskedasticity-only standard errors.
○ probit.
○ logit.
○ linear probability model (OLS) with heteroskedasticity-robust standard errors.

KTL_001_C13_5: Quasi-experiments
● provide a bridge between the econometric analysis of observational data sets and the statistical ideal of a true randomized controlled experiment.
○ are not the same as experiments, and lessons learned from the use of the latter can therefore not be applied to them.
○ most often use difference-in-difference estimators, which are quite different from OLS and instrumental variables methods studied in earlier chapters of the book.
○ use the same methods as studied in earlier chapters of the book, and hence the interpretation of these methods is the same.

KTL_001_C13_6: Testing for the random receipt of treatment
○ is not possible, in general
○ entails testing the hypothesis that the coefficients on W1i, …, Wri are non-zero in a regression of Xi on W1i, …, Wri
○ is not meaningful since the LHS variable is binary
● entails testing the hypothesis that the coefficients on W1i, …, Wri are zero in a regression of Xi on W1i, …, Wri

KTL_001_C13_7: Small sample sizes in an experiment
○ bias the estimators of the causal effect
● may pose a problem because the assumption that errors are normally distributed is dubious for experimental data
○ do not raise threats to the validity of confidence intervals as long as heteroskedasticity-robust standard errors are used
○ may affect confidence intervals but not hypothesis tests

KTL_001_C13_8: A repeated cross-sectional data set is
● a collection of cross-sectional data sets, where each cross-sectional data set corresponds to a different time period
○ the same as a balanced panel data set
○ what Card and Krueger used in their study of the effect of minimum wages on teenage employment
○ time series

KTL_001_C13_9: In a sharp regression discontinuity design,
○ crossing the threshold influences receipt of the treatment but is not the sole determinant
○ the population regression line must be linear above and below the threshold
○ Xi will in general be correlated with ui
● receipt of treatment is entirely determined by whether W exceeds the threshold

KTL_001_C13_10: Threats to internal validity of quasi-experiments include
○ failure of randomization
○ failure to follow the treatment protocol
○ attrition
● all of the above with some modifications from true randomized controlled experiments

Chapter 14: Introduction to Time Series Regression and Forecasting

KTL_001_C14_1: Pseudo out of sample forecasting can be used for the following reasons with the exception of

○ giving the forecaster a sense of how well the model forecasts at the end of the sample.
○ estimating the RMSFE.
● analyzing whether or not a time series contains a unit root.
○ evaluating the relative forecasting performance of two or more forecasting models.

KTL_001_C14_2: One reason for computing the logarithms (ln), or changes in logarithms, of economic time series is that

○ numbers often get very large.
○ economic variables are hardly ever negative.
● they often exhibit growth that is approximately exponential.
○ natural logarithms are easier to work with than base 10 logarithms.

KTL_001_C14_3: The AR(p) model

○ is defined as $Y_{t}=\beta_{0}+\beta_{p} Y_{t-p}+u_{t}$
● represents $Y_{t}$ as a linear function of $p$ of its lagged values.
○ can be represented as follows: $Y_{t}=\beta_{0}+\beta_{1} X_{t}+\beta_{p} Y_{t-p}+u_{t}$
○ can be written as $Y_{t}=\beta_{0}+\beta_{1} Y_{t-1}+u_{t-p}$

KTL_001_C14_4: The Granger Causality Test

● uses the F-statistic to test the hypothesis that certain regressors have no predictive content for the dependent variable beyond that contained in the other regressors.
○ establishes the direction of causality (as used in common parlance) between X and Y in addition to correlation.
○ is a rather complicated test for statistical independence.
○ is a special case of the Augmented Dickey-Fuller test.

KTL_001_C14_5: You should use the QLR test for breaks in the regression coefficients, when

○ the Chow F-test has a p value of between 0.05 and 0.10.
● the suspected break date is not known.
○ there are breaks in only some, but not all, of the regression coefficients.
○ the suspected break date is known.

KTL_001_C14_6: The Bayes-Schwarz Information Criterion (BIC) is given by the following formula

○ commonly used to test for serial correlation
○ only used in cross-sectional analysis
○ developed by the Bank of England in its river of blood analysis
● used to help the researcher choose the number of lags in an autoregression

KTL_001_C14_9: The AIC is a statistic

○ that is used as an alternative to the BIC when the sample size is small (T < 50)
○ often used to test for heteroskedasticity
● used to help a researcher choose the number of lags in a time series with multiple predictors
○ all of the above

KTL_001_C14_10: The formulae for the AIC and the BIC are different. The

○ AIC is preferred because it is easier to calculate
● BIC is preferred because it is a consistent estimator of the lag length
○ difference is irrelevant in practice since both information criteria lead to the same conclusion
○ AIC will typically underestimate p with non-zero probability
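
Explanation (illustrative sketch): for an autoregression with p lags, the usual textbook forms are BIC(p) = ln[SSR(p)/T] + (p + 1) ln(T)/T and AIC(p) = ln[SSR(p)/T] + (p + 1)(2/T), and the chosen lag length minimizes the criterion. The sample size and SSR values below are hypothetical; they are picked so that the heavier BIC penalty selects fewer lags than the AIC.

```python
import math

T = 100  # hypothetical number of observations
# Hypothetical sums of squared residuals from AR(p) fits, p = 0..4.
ssr_by_lag = {0: 200.0, 1: 120.0, 2: 110.0, 3: 106.0, 4: 105.5}


def bic(ssr, p, n_obs):
    """BIC for an AR(p): fit term plus a ln(T)/T penalty per coefficient."""
    return math.log(ssr / n_obs) + (p + 1) * math.log(n_obs) / n_obs


def aic(ssr, p, n_obs):
    """AIC for an AR(p): same fit term with a 2/T penalty per coefficient."""
    return math.log(ssr / n_obs) + (p + 1) * 2.0 / n_obs


bic_lag = min(ssr_by_lag, key=lambda p: bic(ssr_by_lag[p], p, T))
aic_lag = min(ssr_by_lag, key=lambda p: aic(ssr_by_lag[p], p, T))

print("lag chosen by BIC:", bic_lag)
print("lag chosen by AIC:", aic_lag)
```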

Chapter 15: Estimation of Dynamic Causal Effects

KTL_001_C15_1: Ascertaining whether or not a regressor is strictly exogenous or exogenous ultimately requires all of the following with the exception of

○ economic theory.
○ institutional knowledge.
○ expert judgment.
● use of HAC standard errors.

KTL_001_C15_2: The impact effect is the

● zero period dynamic multiplier.
○ h period dynamic multiplier, h>0.
○ cumulative dynamic multiplier.
○ long-run cumulative dynamic multiplier.
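
Explanation (illustrative sketch): in a distributed lag model the coefficient on X_t is the impact effect (zero period dynamic multiplier), the coefficient on X_{t-h} is the h period dynamic multiplier, and cumulative multipliers are running sums of these coefficients. The coefficient values below are hypothetical.

```python
from itertools import accumulate

# Hypothetical coefficients on X_t, X_{t-1}, X_{t-2}.
dynamic_multipliers = [0.5, 0.3, 0.2]

impact_effect = dynamic_multipliers[0]                 # zero period multiplier
cumulative = list(accumulate(dynamic_multipliers))     # running sums
long_run_cumulative = cumulative[-1]                   # sum of all coefficients

print("impact effect:", impact_effect)
print("cumulative multipliers:", cumulative)
print("long-run cumulative multiplier:", long_run_cumulative)
```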

KTL_001_C15_3: Quasi differences in Yt are defined as

○ $Y_{t}-Y_{t-1}$
● $Y_{t}-\phi_{1} Y_{t-1}$
○ $\Delta Y_{t}-\phi_{1} Y_{t-1}$
○ $\phi_{1}\left(Y_{t}-Y_{t-1}\right)$
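
Explanation (illustrative sketch): a quasi difference subtracts only a fraction φ1 of the previous value, so the ordinary first difference is the special case φ1 = 1. The function name `quasi_difference` and the numbers below are hypothetical.

```python
def quasi_difference(series, phi1):
    """Return Y_t - phi1 * Y_{t-1} for t = 2, ..., T (one observation is lost)."""
    return [curr - phi1 * prev for prev, curr in zip(series, series[1:])]


y = [1.0, 2.0, 4.0, 8.0]

print(quasi_difference(y, 0.5))  # quasi differences with phi1 = 0.5
print(quasi_difference(y, 1.0))  # ordinary first differences
```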

KTL_001_C15_4: GLS involves

○ writing the model in differences and estimating it by OLS, using HAC standard errors.
○ truncating the sample at both ends of the period, then estimating by OLS using HAC standard errors.
○ checking the AIC rather than the BIC in choosing the maximum lag-length of the regressors.
● transforming the regression model so that the errors are homoskedastic and serially uncorrelated, and then estimating the transformed regression model by OLS.

KTL_001_C15_5: HAC standard errors should be used because

○ they are convenient simplifications of the heteroskedasticity-robust standard errors.
● conventional standard errors may result in misleading inference.
○ they are easier to calculate than the heteroskedasticity-robust standard errors and yet still allow you to perform inference correctly.
○ when there is a structural break, then conventional standard errors result in misleading inference.

KTL_001_C15_6: The interpretation of the coefficients in a distributed lag regression as causal dynamic effects hinges on

● the assumption that X is exogenous
○ not having more than four lags when using quarterly data
○ using GLS rather than OLS
○ the use of monthly rather than annual data

KTL_001_C15_7: Given the relationship between the two variables, the following is most likely to be exogenous:

○ the inflation rate and the short term interest rate: short-term interest rate is exogenous
○ U.S. rate of inflation and increases in oil prices: oil prices are exogenous
● Australian exports and U.S. aggregate income: U.S. aggregate income is exogenous
○ change in inflation, lagged changes of inflation, and lags of unemployment: lags of unemployment are exogenous

KTL_001_C15_8: When Xt is strictly exogenous, the following estimator(s) of dynamic causal effects are available:

○ estimating an ADL model and calculating the dynamic multipliers from the estimated ADL coefficients
○ using GLS to estimate the coefficients of the distributed lag model
○ (a) but not (b)
● (a) and (b)

KTL_001_C15_9: In time series data, it is useful to think of a randomized controlled experiment

● consisting of the same subject being given different treatments at different points in time
○ consisting of different subjects being given the same treatment at the same point in time
○ as being non-existent (“parallel universes” only exist in science fiction)
○ consisting of at least two subjects being given different treatments at the same point in time

KTL_001_C15_10: Consider the distributed lag model

$Y_{t}=\beta_{0}+\beta_{1} X_{t}+\beta_{2} X_{t-1}+\beta_{3} X_{t-2}+\ldots+\beta_{r+1} X_{t-r}+u_{t}$. The dynamic causal effect is

○ $\beta_{0}+\beta_{1}$
● $\beta_{1}+\beta_{2}+\ldots+\beta_{r+1}$
○ $\beta_{0}+\beta_{1}+\beta_{2}+\ldots+\beta_{r+1}$
○ $\beta_{1}$

Chapter 16: Additional Topics in Time Series Regression

KTL_001_C16_1: If Yt is I(2), then

● $\Delta^{2} Y_{t}$ is stationary.
○ $Y_{t}$ has a unit autoregressive root.
○ $\Delta Y_{t}$ is stationary.
○ $Y_{t}$ is stationary.
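
Explanation (illustrative sketch): the second difference is the first difference applied twice. The series below is a hypothetical deterministic quadratic trend rather than a stochastic I(2) process, so it only illustrates the mechanics of differencing: one difference still trends, two differences give a constant.

```python
def difference(series):
    """First difference: Y_t - Y_{t-1}."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


y = [t ** 2 for t in range(5)]   # 0, 1, 4, 9, 16
d1 = difference(y)               # still trending upward
d2 = difference(d1)              # constant after the second difference

print(d1)
print(d2)
```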

KTL_001_C16_2: A VAR with five variables, 4 lags and constant terms for each equation will have a total of

○ 21 coefficients.
○ 100 coefficients.
● 105 coefficients.
○ 84 coefficients.
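
Explanation (illustrative sketch): each equation of a VAR with k variables and p lags contains k × p lag coefficients plus one constant, and there are k such equations. The function name below is hypothetical.

```python
def var_coefficient_count(k, p, constant=True):
    """Total number of coefficients in a VAR with k variables and p lags."""
    per_equation = k * p + (1 if constant else 0)
    return k * per_equation


print(var_coefficient_count(k=5, p=4))
```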

KTL_001_C16_3: The order of integration

○ can never be zero.
● is the number of times that the series needs to be differenced for it to be stationary.

○ is the value of $\phi_{1}$ in the quasi difference $\Delta Y_{t}-\phi_{1} Y_{t-1}$

○ depends on the number of lags in the VAR specification.

KTL_001_C16_4: The following is not an appropriate way to tell whether two variables are cointegrated

● see if the two variables are integrated of the same order.
○ graph the series and see whether they appear to have a common stochastic trend.
○ perform statistical tests for cointegration.
○ use expert knowledge and economic theory.

KTL_001_C16_5: ARCH and GARCH models are estimated using the

○ OLS estimation method.
● method of maximum likelihood.
○ DOLS estimation method.
○ VAR specification.

KTL_001_C16_6: A VAR with k time series variables consists of

● k equations, one for each of the variables, where the regressors in all equations are lagged values of all the variables
○ a single equation, where the regressors are lagged values of all the variables
○ k equations, one for each of the variables, where the regressors in all equations are never more than one lag of all the variables
○ k equations, one for each of the variables, where the regressors in all equations are current values of all the variables

KTL_001_C16_7: The BIC for the VAR is

○ $BIC(p)=\ln \left[\operatorname{det}\left(\hat{\Sigma}_{u}\right)\right]+k(k p+1) \frac{2}{T}$
○ $BIC(p)=\ln \left[\operatorname{det}\left(\hat{\Sigma}_{u}\right)\right]+k(p+1) \frac{\ln (T)}{T}$
● $BIC(p)=\ln \left[\operatorname{det}\left(\hat{\Sigma}_{u}\right)\right]+k(k p+1) \frac{\ln (T)}{T}$
○ $BIC(p)=\ln [SSR(p)]+k(k p+1) \frac{\ln (T)}{T}$

KTL_001_C16_8: Lag length selection in a VAR using the BIC proceeds as follows: among a set of candidate values of p, the estimated lag length $\hat{p}$ is the value of p

○ For which the BIC exceeds the AIC
○ That maximizes BIC(p)
○ Cannot be determined here since a VAR is a system of equations, not a single one
● That minimizes BIC(p)

KTL_001_C16_9: The dynamic OLS (DOLS) estimator of the cointegrating coefficient, if Yt and Xt are cointegrated,

○ is efficient in large samples
○ statistical inference about the cointegrating coefficient is valid
○ the t-statistic constructed using the DOLS estimator with HAC standard errors has a standard normal distribution in large samples
● all of the above

KTL_001_C16_10: The EG-ADF test

○ Is similar to the DF-GLS test
● Is a test for cointegration
○ Has as a limitation that it can only test if two variables, but not more than two, are cointegrated
○ Uses the ADF in the second step of its procedure

Chapter 17: The Theory of Linear Regression with One Regressor

KTL_001_C17_1: When the errors are heteroskedastic, then

● WLS is efficient in large samples, if the functional form of the heteroskedasticity is known.
○ OLS is biased.
○ OLS is still efficient as long as there is no serial correlation in the error terms.
○ weighted least squares is efficient.

KTL_001_C17_2: Asymptotic distribution theory is

○ not practically relevant, because we never have an infinite number of observations.
○ only of theoretical interest.
○ of interest because it tells you what the distribution approximately looks like in small samples.
● the distribution of statistics when the sample size is very large.

KTL_001_C17_3: Under the five extended least squares assumptions, the homoskedasticity-only t-statistic in this chapter

● has a Student t distribution with n-2 degrees of freedom.
○ has a normal distribution.

○ converges in distribution to a $\chi_{n-2}^{2}$ distribution.

○ has a Student t distribution with n degrees of freedom.

KTL_001_C17_4: If the errors are heteroskedastic, then

○ the OLS estimator is still BLUE as long as the regressors are nonrandom.
○ the usual formula cannot be used for the OLS estimator.
○ your model becomes overidentified.
● the OLS estimator is not BLUE.

KTL_001_C17_5: The advantage of using heteroskedasticity-robust standard errors is that

○ they are easier to compute than the homoskedasticity-only standard errors.
● they produce asymptotically valid inferences even if you do not know the form of the conditional variance function.
○ they make the OLS estimator BLUE, even in the presence of heteroskedasticity.
○ they do not unnecessarily complicate matters, since in real-world applications, the functional form of the conditional variance can easily be found.

KTL_001_C17_6: In order to use the t-statistic for hypothesis testing and constructing a 95% confidence interval as ±1.96 standard errors, the following three assumptions have to hold:

● the conditional mean of ui, given Xi is zero; (Xi,Yi), i = 1,2, …, n are i.i.d. draws from their joint distribution; Xi and ui have four moments
○ the conditional mean of ui, given Xi is zero; (Xi,Yi), i = 1,2, …, n are i.i.d. draws from their joint distribution; homoskedasticity
○ the conditional mean of ui, given Xi is zero; (Xi,Yi), i = 1,2, …, n are i.i.d. draws from their joint distribution; the conditional distribution of ui given Xi is normal
○ none of the above

KTL_001_C17_7: If the functional form of the conditional variance function is incorrect, then

● the standard errors computed by WLS regression routines are invalid
○ the OLS estimator is biased
○ instrumental variable techniques have to be used

○ the regression $R^{2}$ can no longer be computed

KTL_001_C17_8: Suppose that the conditional variance is $\operatorname{var}\left(u_{i} \mid X_{i}\right)=\lambda h\left(X_{i}\right)$ where $\lambda$ is a constant and $h$ is a known function. The WLS estimator is

○ the same as the OLS estimator since the function is known
○ can only be calculated if you have at least 100 observations
● the estimator obtained by first dividing the dependent variable and regressor by the square root of h and then regressing this modified dependent variable on the modified regressor using OLS
○ the estimator obtained by first dividing the dependent variable and regressor by h and then regressing this modified dependent variable on the modified regressor using OLS
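
Explanation (illustrative sketch): dividing every term of Y = β0 + β1 X + u by the square root of h(X), including the constant, gives a transformed model whose errors are homoskedastic, and OLS on that model is the WLS estimator. The function `wls_fit` and the data are hypothetical; the data are noise-free with assumed coefficients (1, 2), so the sketch only checks the mechanics.

```python
import math


def wls_fit(x, y, h):
    """WLS for Y = b0 + b1 * X + u when var(u | X) is proportional to h(X)."""
    w = [1.0 / math.sqrt(h(xi)) for xi in x]
    # Transformed variables: constant, regressor and outcome each divided by sqrt(h).
    c = w
    xt = [wi * xi for wi, xi in zip(w, x)]
    yt = [wi * yi for wi, yi in zip(w, y)]
    # OLS of yt on (c, xt) without a separate intercept: 2x2 normal equations.
    s_cc = sum(a * a for a in c)
    s_cx = sum(a * b for a, b in zip(c, xt))
    s_xx = sum(b * b for b in xt)
    s_cy = sum(a * v for a, v in zip(c, yt))
    s_xy = sum(b * v for b, v in zip(xt, yt))
    det = s_cc * s_xx - s_cx ** 2
    b0 = (s_xx * s_cy - s_cx * s_xy) / det
    b1 = (s_cc * s_xy - s_cx * s_cy) / det
    return b0, b1


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]           # hypothetical noise-free data
b0_hat, b1_hat = wls_fit(x, y, h=lambda v: v ** 2)

print(b0_hat, b1_hat)
```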

KTL_001_C17_9: The large-sample distribution of $\hat{\beta}_{1}$ is

● $\sqrt{n}\left(\hat{\beta}_{1}-\beta_{1}\right) \rightarrow N\left[0, \frac{\operatorname{var}\left(\nu_{i}\right)}{\left[\operatorname{var}\left(X_{i}\right)\right]^{2}}\right]$ where $\nu_{i}=\left(X_{i}-\mu_{X}\right) u_{i}$
○ $\sqrt{n}\left(\hat{\beta}_{1}-\beta_{1}\right) \rightarrow N\left[0, \frac{\operatorname{var}\left(\nu_{i}\right)}{\left[\operatorname{var}\left(X_{i}\right)\right]^{2}}\right]$ where $\nu_{i}=u_{i}$
○ $\sqrt{n}\left(\hat{\beta}_{1}-\beta_{1}\right) \rightarrow N\left[0, \frac{\operatorname{var}\left(\nu_{i}\right)}{\left[\operatorname{var}\left(X_{i}\right)\right]^{2}}\right]$ where $\nu_{i}=X_{i} u_{i}$
○ $\sqrt{n}\left(\hat{\beta}_{1}-\beta_{1}\right) \rightarrow N\left[0, \frac{\sigma_{u}^{2}}{\left[\operatorname{var}\left(X_{i}\right)\right]^{2}}\right]$

KTL_001_C17_10: Assume that $\operatorname{var}\left(u_{i} \mid X_{i}\right)=\phi_{0}+\phi_{1} X_{i}^{2}$. One way to estimate $\phi_{0}$ and $\phi_{1}$ consistently is to regress

○ $\hat{u}_{i}$ on $X_{i}^{2}$ using OLS
● $\hat{u}_{i}^{2}$ on $X_{i}^{2}$ using OLS
○ $\hat{u}_{i}^{2}$ on $\sqrt{X_{i}}$ using OLS
○ $\hat{u}_{i}^{2}$ on $X_{i}^{2}$ using OLS, but suppressing the constant ("restricted least squares")

Chapter 18: The Theory of Multiple Regression

KTL_001_C18_1: The multiple regression model can be written in matrix form as follows:

○ Y=Xβ
○ Y=X+U
○ Y=βX+U
● Y=Xβ+U

KTL_001_C18_2: One of the properties of the OLS estimator is

○ $X \hat{\beta}=\mathbf{0}_{k+1}$
○ that the coefficient vector $\hat{\beta}$ has full rank.
● $X^{\prime}(Y-X \hat{\beta})=\mathbf{0}_{k+1}$
○ $\left(X^{\prime} X\right)^{-1}=X^{\prime} Y$

KTL_001_C18_3: $\hat{\beta}-\beta$

○ cannot be calculated since the population parameter is unknown.
● $=\left(X^{\prime} X\right)^{-1} X^{\prime} U$
○ $=Y-\hat{Y}$
○ $=\beta+\left(X^{\prime} X\right)^{-1} X^{\prime} U$

KTL_001_C18_4: The formulation $R \beta=r$ to test a hypothesis

● allows for restrictions involving both multiple regression coefficients and single regression coefficients.
○ is F-distributed in large samples.
○ allows only for restrictions involving multiple regression coefficients.
○ allows for testing linear as well as nonlinear hypotheses.

KTL_001_C18_5: The GLS estimator

○ is always the more efficient estimator when compared to OLS.
● is the OLS estimator of the coefficients in a transformed model, where the errors of the transformed model satisfy the Gauss-Markov conditions.
○ cannot handle binary variables, since some of the transformations require division by one of the regressors.
○ produces identical estimates for the coefficients, but different standard errors.

KTL_001_C18_6: The extended least squares assumptions in the multiple regression model include four assumptions from Chapter 6 (ui has conditional mean zero; (Xi,Yi), i = 1,…, n are i.i.d. draws from their joint distribution; Xi and ui have nonzero finite fourth moments; there is no perfect multicollinearity). In addition, there are two further assumptions, one of which is

○ heteroskedasticity of the error term.
○ serial correlation of the error term.
● The conditional distribution of ui given Xi is normal.
○ invertibility of the matrix of regressors.

KTL_001_C18_7: The OLS estimator for the multiple regression model in matrix form is

● $\left(X^{\prime} X\right)^{-1} X^{\prime} Y$
○ $X\left(X^{\prime} X\right)^{-1} X^{\prime}-P_{X}$
○ $\left(X^{\prime} X\right)^{-1} X^{\prime} U$
○ $\left(X^{\prime} \Omega^{-1} X\right)^{-1}\left(X^{\prime} \Omega^{-1} Y\right)$
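
Explanation (illustrative sketch): the matrix formula (X'X)^{-1} X'Y can be checked on a small example with an intercept and one regressor, together with the property that the residuals are orthogonal to the columns of X. The data and helper functions below are hypothetical, and the 2×2 inverse is written out by hand to keep the sketch free of external libraries.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical data: first column of X is the constant regressor.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
Y = [[2.0], [3.0], [5.0], [4.0]]

Xt = transpose(X)
beta_hat = matmul(inverse_2x2(matmul(Xt, X)), matmul(Xt, Y))  # (X'X)^{-1} X'Y

fitted = matmul(X, beta_hat)
residuals = [[yi[0] - fi[0]] for yi, fi in zip(Y, fitted)]
normal_equations = matmul(Xt, residuals)  # X'(Y - X beta_hat), should be ~0

print("beta_hat:", beta_hat)
print("X'(Y - X beta_hat):", normal_equations)
```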

KTL_001_C18_8: To prove that the OLS estimator is BLUE requires the following assumption

○ (Xi,Yi) i = 1, …, n are i.i.d. draws from their joint distribution
○ Xi and ui have nonzero finite fourth moments
○ The conditional distribution of ui given Xi is normal
● None of the above

KTL_001_C18_9: The TSLS estimator is

○ $\left(X^{\prime} X\right)^{-1} X^{\prime} Y$
● $\left[\left(X^{\prime} Z\right)\left(Z^{\prime} Z\right)^{-1}\left(Z^{\prime} X\right)\right]^{-1} X^{\prime} Z\left(Z^{\prime} Z\right)^{-1} Z^{\prime} Y$
○ $\left(X^{\prime} \Omega^{-1} X\right)^{-1}\left(X^{\prime} \Omega^{-1} Y\right)$
○ $\left(X^{\prime} P_{Z}\right)^{-1} P_{Z} Y$

KTL_001_C18_10: The homoskedasticity-only F-statistic is

