Monday, November 18, 2013

Lecture Notes for Polling

Polling
Biggest Polling Fails in (US) History

Polling is a subset of generalizations, so many of the rules for evaluation and analysis will be the same as in the previous section on generalizations.  A poll is a generalization about a specified population's beliefs or attitudes.  For example, during election campaigns, the populations in important "battleground" states are usually polled to find out what issues are important to them.  Upon hearing the results, the candidate will then remove what's left of his own spine and say whatever that population wants to hear.  (Meh!  Call me a cynic...)

Suppose I were to conduct a poll of UNLV students to determine their primary motivation for attending university.  To begin the evaluation of the poll we'd need to know 3 things:

(a)  The sample:  Who is in the sample (representativeness) and how big is it?
(b)  The population:  What is the group I'm trying to make the generalization about?
(c)  The property in question:  What is the belief, attitude, or value I'm trying to attribute to the population?

Recall from the previous section that generalizations can be interpreted as having an (implicit or explicit) argument form.  Let's instantiate this argument structure with a hypothetical poll.  Suppose I want to poll UNLV students with the question, "Should critical thinking 102 be a graduation requirement?"  Because I have finite time and energy I can't ask each student at the university.  Instead I'll take a sample and extrapolate from that.  My sample will be students in my class.

P1.  A sample of 36 students from my class is a representative sample of the general student population.
P2.  65% of the students in my class (i.e., the sample) said they agree that critical thinking 102 should be a graduation requirement.
C.  Therefore, we can conclude that around 65% of UNLV students think that critical thinking 102 should be a graduation requirement.

There are 2 broad categories of analysis we can apply to the poll results:

Sampling Errors
Questions about sampling errors apply to P1.  Basically they are: (a) is the sample size large enough to be representative of the group and (b) does the sample avoid any biases (i.e., does it avoid under- or over-representing one group relative to another in a way that isn't reflective of the general population)?

Regarding sample size, national polls generally require a (representative) sample size of 1000, so we should expect that a poll about the UNLV population could get by with quite a bit less than that.  Aside from that, (a) is self-explanatory and I've discussed it above, so let's look a little more closely at (b).

The question here is whether the students in my class accurately represent all important subgroups in the student population.  For example, is the sample representative of the UNLV general population's ratio of ethnic groups, socio-economic groups, and majors?  You might find that there are other important subgroups that should be captured in a sample depending on the content of the poll.

Someone might plausibly argue that the sample isn't representative because it disproportionately represents students in their 1st and 2nd years.

We can ask a further question about how the group was chosen.  For example, if I make filling out the survey voluntary then there's a possibility of bias.  Why? Because it's possible that people who volunteer for such a survey have a strong opinion one way or another.  This means that the poll will capture only those with strong opinions (or  those who just generally like to give their opinion) but leave out the Joe-Schmo population who might not have strong feelings or might be too busy facebooking on their got-tam phone to bother to do the survey.
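To see how much a voluntary survey can skew things, here's a rough back-of-the-envelope sketch.  Every number in it is made up for illustration (the group names, their shares, and how likely each group is to volunteer are all my assumptions, not real data).  It just compares what the whole population thinks with what we'd expect to see among the people who bother to respond.

```python
# Hypothetical population: every number below is invented for illustration.
# Each group maps to (share of population, fraction who agree, chance of volunteering).
groups = {
    "strongly in favor": (0.10, 1.0, 0.60),
    "strongly against": (0.20, 0.0, 0.60),
    "mild or no opinion": (0.70, 0.6, 0.05),
}

def true_support(groups):
    """Share of the whole population that agrees."""
    return sum(share * agree for share, agree, _ in groups.values())

def expected_volunteer_support(groups):
    """Expected share of agreement among those who volunteer to respond."""
    responders = sum(share * respond for share, _, respond in groups.values())
    agreeing = sum(share * agree * respond for share, agree, respond in groups.values())
    return agreeing / responders

print(f"True support:          {true_support(groups):.1%}")               # 52.0%
print(f"Support in the survey: {expected_volunteer_support(groups):.1%}")  # 37.7%
```

With these made-up numbers a slim majority of the population agrees, but the volunteers make it look like a clear minority, simply because the Joe-Schmos mostly stayed home.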

In order to protect against such sampling errors polls should engage in random sampling.  That means no matter what sub-group someone is in, they have an equal probability of being selected to do the survey.  We can also take things to a whole.  nuva.  level.  when we use stratified sampling.  With stratified sampling we make sure a representative proportion of each subgroup is contained in the general sample.  For example, if I know that about 30% of students are 1st year students then I'll make sure that 30% of my sample is made up of randomly selected 1st year students.
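Here's a minimal sketch of the difference between the two approaches.  The student body is hypothetical (1000 students, 30% of them 1st years, the rest split up by numbers I invented), and the function names are mine, not from any polling library.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of being picked."""
    return rng.sample(population, n)

def stratified_sample(population, n, key, rng):
    """Randomly sample each subgroup in proportion to its share of the population."""
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounding can leave the total slightly above or below n for awkward proportions.
        quota = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical student body: 300 1st years, 250 2nd, 250 3rd, 200 4th.
years = [1] * 300 + [2] * 250 + [3] * 250 + [4] * 200
population = [{"id": i, "year": year} for i, year in enumerate(years)]

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = stratified_sample(population, 100, key=lambda s: s["year"], rng=rng)
first_years = sum(1 for s in sample if s["year"] == 1)
print(first_years)  # 30
```

A simple random sample of 100 would land near 30 first-years on average but could easily come out at 25 or 35.  The stratified version pins the proportion down and only leaves to chance which first-years get picked.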

Another thing to consider under sampling error is margin of error.  The margin of error (e.g. +/-5%) tells us how far the true figure for the whole population is likely to be from the figure the poll reports.  Margin of error is important to consider when there is a small difference between competing results.  For example, suppose a survey says 46% of students think Ami should be burned at the stake while 50% say Ami should be hailed as the next messiah.  One might think this clearly shows Ami's well on his way to establishing a new religion but we'd be jumping the gun until we looked at the poll's margin of error.

Suppose the margin of error is +/- 5%.  Strictly speaking that means 5 percentage points either way, so those that want to burn Ami at the stake could actually be as high as 51% (46 + 5) and those that want to make him the head of a new religion could be as low as 45% (50 - 5).  The two ranges overlap, so the poll can't tell us who is really ahead.  Ami might have to wait a few more years for world domination.
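For the curious, here's a rough sketch of where a margin of error comes from and how to check whether two results are too close to call.  It leans on assumptions I'm adding: a simple random sample, the standard normal approximation at 95% confidence (z = 1.96), the worst case of a 50/50 split, and the margin read as percentage points, which is the usual convention in published polls.  The function names are just for illustration.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error (as a fraction) for a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

def ranges_overlap(result_a, result_b, margin):
    """True if two results, widened by the margin (percentage points), overlap."""
    low_a, high_a = result_a - margin, result_a + margin
    low_b, high_b = result_b - margin, result_b + margin
    return low_a <= high_b and low_b <= high_a

print(f"n=1000: +/-{margin_of_error(1000):.1%}")  # +/-3.1%
print(f"n=36:   +/-{margin_of_error(36):.1%}")    # +/-16.3%

# The Ami poll: 46% vs 50% with a margin of 5 points.
print(ranges_overlap(46, 50, 5))  # True, so too close to call
```

Notice what happens with a sample of 36 (the size of my class): even if the sample were perfectly random, the margin would be roughly 16 points either way, which is a lot of wiggle room.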

As I mentioned in the beginning of this section, questions about sampling error are all directed at P1; i.e., is the sample representative of the general population about which the general claim will be made?  Next we will look at measurement errors, which have to do with the second premise (i.e., that the people in the sample actually do have the beliefs/attitudes/properties attributed to them in the survey).

Measurement Errors
Measurement errors have to do with scrutinizing the claim that the sample population actually has the beliefs/attitudes/properties attributed to them in the survey.  Evaluating polls for measurement errors generally has to do with how the information was asked for and collected, how the questions were worded, and the environmental conditions at the time of the poll.

As a starting point, when we are looking at polls that are about political issues, we should generally be skeptical of results--especially when polling agencies that are tied to a political party or ideology produce competing poll results that conform with their respective positions.  In short, we should be alert to who is conducting the poll and consider whether there may be any biases.

One specific type of measurement error arises out of semantic ambiguity or vagueness.  For example, suppose a survey asks if you drink "frequently".  This is a subjective term and could be interpreted differently.  For some people it might mean 1x a week, for others once a day.  A measurement error will be introduced into the data unless this vagueness is cleared up.  Because most people probably think of "frequent drinking" as being "more than what I personally drink", the results will be artificially low.  They also will not be very meaningful because the responses don't all mean the same thing.

Another type of measurement error arises when we consider the medium by which the questions are asked.  Psychology tells us that people are more likely to tell the truth when asked questions face to face and less so when asked over the phone.  Even less so when asked in groups  (groupthink).  These considerations will introduce measurement errors; that is, they will cast doubt on whether the members of the sample actually have the quality/view/belief being attributed to them.

When evaluating measurement accuracy we should also consider when and where the poll took place.  For example, if, during exam period, students are asked whether they think school is stressful (generally), probably more will answer in the affirmative than if they are asked during the 1st week of the semester.

Also, going back to our poll of students concerning having critical thinking as a graduation requirement, we might argue that the timing is influencing the results.  The sample is taken from students currently taking the class.  Perhaps it's too early in their career to appreciate the course's value; yet if we asked students who had already taken the course and have had a chance to enjoy the glorious fruits of the class, the results might be different.

Finally, we should be alert to how second-hand reporting of polls can present the results in a distorted way.  Newspapers and media outlets want eyeballs, so they might over-emphasize certain aspects of the poll or interpret the results in a way that sensationalizes them.  In short, we should approach with a grain of salt polls that are reported second-hand.

To summarize:  For polling we want to evaluate (1) whether the sample is free of sampling errors (is it large enough and representative of the population?) and (2) whether the poll is free of measurement errors (do the individuals in the sample actually have the values/attitudes/beliefs being attributed to them?).
