Learning Adventure 5

Making Data and Making Sense of Data


Executive Summary:

1. The problem:

Part 1: Create one or more data sets that address questions related to previous U.S. Presidential Elections, save them as .csv or tab-delimited .txt files, and share them with the Cadre.

Part 2: Install and learn to use InspireData. Import your data set, then ask and answer questions of it.

Part 3: Merge data sets from other Cadre members to create a combined data set that can be interrogated using InspireData. Ensure the data set is as complete as possible and include any additional notes or embellishments to improve the data and make it more useful for others.

2. Output:

Download a trial of InspireData to review the data in that tool: http://cf.inspiration.com/freetrial/index.cfm?fuseaction=inspiredata_qual_form
Final data set in InspireData format (Right-click for save options.)
Final data set in Excel format (Right-click for save options.)
Final data set in .csv format (Right-click for save options.)

3. Discussion of relevance/meaning:

The meaning I took from this Learning Adventure (LA) was the value of being able to collect and analyze data to answer important questions. The timing of this LA was particularly relevant given its proximity to the 2008 U.S. Presidential Election. That proximity made the data analysis more engaging, as it tied so directly to current issues and events.

As I continue to work on my Action Research Project (ARP), this LA demonstrates the value of collecting and maintaining appropriate data to support the work I am doing. Without this data it is difficult to communicate the impact or significance of my efforts to others. I was struck by the way that getting close to a specific segment of election data shone a light on aspects of our Presidential history of which I was not aware. As I worked to interrogate the data I continued to gain more knowledge of our election history and began to ask more questions about the elections, some of which could be answered by the data and some of which required outside research.

The challenge of merging other cadre members’ data sets into my set allowed me to see the value in creating a very complete and relevant data set. In this way it is most likely to be of use to others and to align with the work of others. With the highly varied approaches that were taken to the project it was somewhat difficult to find another data set that would naturally blend with my own set. Part of this was due to the macro view of the elections that I took, versus the specific view that many took. In the end I was able to combine data by making adjustments to my own and the other cadre member’s data so that they would merge and provide meaningful information.

Another essential lesson for me from this LA was that children should engage with these types of data analysis tools early in their learning careers. It was fascinating that the InspireData software, while designed for use by children, provided access to sophisticated views into data sets. I certainly learned a great deal from working with the data in this tool and personally found it more rewarding than the work I have done with Excel, or similar spreadsheet tools, to graph and analyze data.

4. Possible conclusions/solution:

My conclusion from this LA is that it reminded me of the value of knowing how to find, organize, and analyze data that is relevant to your efforts. This is an essential part of researching and communicating important information if it is to be taken seriously by others. Without data to back up your claims, and a way to present it that relates to others, it is difficult to have others recognize and appreciate the value of your work or argument.

While tool selection is not as critical, knowing how to use the selected tool to analyze and represent the data appropriately is necessary. One can collect the most meaningful data in the world and still fail to communicate its importance if the data is not cleanly organized and presented visually in a meaningful way. The process of collecting, cleaning, and sharing data in this LA highlighted the challenge of presenting complete and accurate data that can relate to the work of others and magnify the meaning of both.

The process of learning and working with the InspireData application showed me how critical it is to have the data clean and formatted appropriately for it to be useful. I had some initial trouble importing my comma-separated .txt file because it contained a character that the tool did not support. I have worked with many imports and exports of various data formats, so I was able to locate the issue fairly quickly, but it reminded me how important it is that data be kept clean and formatted in a way that will work across various tools. The same was true of selecting appropriate units of measure for data fields such as height and weight. A height field holding text such as “5 foot 10 inches” is far less useful for analysis than one holding “70” with a note that the field is in inches.
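The two cleaning steps described above can be sketched in a few lines of Python. This is only an illustration, not the process I actually followed: the function names are hypothetical, and since I did not record which character InspireData rejected, the sketch simply assumes that any non-ASCII character is a candidate worth checking.

```python
import re

# Accepts forms such as "5 foot 10 inches", "6 feet", or "5 ft 10".
_HEIGHT_PATTERN = re.compile(
    r"\s*(\d+)\s*(?:foot|feet|ft)\s*(?:(\d+)\s*(?:inches|inch|in)?)?\s*",
    re.IGNORECASE,
)


def height_to_inches(text):
    """Convert a free-text height into a single number of inches."""
    match = _HEIGHT_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"Unrecognized height: {text!r}")
    feet = int(match.group(1))
    inches = int(match.group(2) or 0)  # inches part is optional
    return feet * 12 + inches


def find_non_ascii(line):
    """Return (position, character) pairs for characters outside ASCII.

    Assumption: the unsupported character was non-ASCII (for example a
    curly quote); the original import error did not say which one it was.
    """
    return [(i, ch) for i, ch in enumerate(line) if ord(ch) > 127]


print(height_to_inches("5 foot 10 inches"))  # 70
```

Running `find_non_ascii` over each line of the file before importing would point to the row and column of any suspect character, which is quicker than hunting for it by eye.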

Overall I have seen that it is important to step back from your data and ask questions such as the ones posed in this assignment: “Is your data set complete enough for users to interrogate the database?” and “Would your work matter to others?” These and other questions help ensure a complete and relevant set of data that speaks to others about the importance of your work and allows them to ask and answer their own questions.

5. Supporting evidence – may include links, graphics, references, supportive arguments:

References:

The New York Times (online), “The Measure of a President” October 6, 2008: http://www.nytimes.com/interactive/2008/10/06/opinion/06opchart.html
Dave Leip’s “Atlas of U.S. Presidential Elections”: http://uselectionatlas.org/
Wikipedia: http://en.wikipedia.org/wiki/Heights_of_United_States_Presidents_and_presidential_candidates

Reflection on the Process:

Reflection posted in the discussion forum of Part One of the LA:

I used the site that Greg mentioned (http://uselectionatlas.org/RESULTS/) to create a data set of all the core election statistics from 1789 through 2004, including: Presidential Candidate, Vice Presidential Candidate, Political Party, Popular Vote, Electoral Vote, and House Vote. This data also includes the home state of the candidates when provided within the site’s data.
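For anyone wanting to build a similar file, here is a minimal sketch of writing those fields out as both .csv and tab-delimited text using Python's standard `csv` module. The `Year` column and the `to_delimited` helper are assumptions for illustration. The two sample rows carry only the 1824 House Vote figures that appear in the notes of my final file, with every other field left blank rather than guessed.

```python
import csv
import io

# Column names follow the fields listed above; "Year" is an assumed extra column.
FIELDS = [
    "Year",
    "Presidential Candidate",
    "Vice Presidential Candidate",
    "Political Party",
    "Popular Vote",
    "Electoral Vote",
    "House Vote",
]

# Deliberately incomplete rows: missing fields are written as empty cells.
rows = [
    {"Year": 1824, "Presidential Candidate": "John Quincy Adams", "House Vote": 13},
    {"Year": 1824, "Presidential Candidate": "Andrew Jackson", "House Vote": 7},
]


def to_delimited(rows, delimiter):
    """Serialize rows to delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        delimiter=delimiter,
        restval="",          # fill any missing field with an empty cell
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_delimited(rows, ",")   # contents for a .csv file
tsv_text = to_delimited(rows, "\t")  # contents for a tab-delimited .txt file
print(csv_text)
```

Leaving unknown values as empty cells, instead of text such as "n/a", keeps numeric columns numeric when the file is imported into an analysis tool.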

It was interesting to look through so much of the data from each election, even at a fairly high level, while putting together this data set. I was intrigued by the number and variety of political parties that have shown up on the ballot over the years: Prohibition, Farmer-Laborer, Progressive, Socialist, Communist, and many more. Reading through them while collecting the data brought back vague memories of history classes from the distant past. I also found it interesting that several elections included multiple Vice Presidential candidates aligned to one Presidential candidate.

I am looking forward to researching these areas and any more that I can based on the information included in the data set.

Reflection posted in the discussion forum of Part Two of the LA:

Donna and I had a great time today working with the data and trying to answer various questions. As Donna mentioned in her posting below, assigning labels and colors was very easy, and I could see how even very young children could use a program like this to explore data analysis.

Along with the overall ease of use, I found this program fairly robust in terms of the possibilities in graphing and analyzing data. The ability to apply custom formulas takes this app to a higher level of usefulness and extensibility. There is certainly plenty of room for me to continue learning through the use of this tool.

This evening I was able to do some data scrubbing and combining to pull together the data Anne collected with my own and create a new database to answer additional questions. Along with the information that Anne put together related to the height and weight of Presidential candidates I was able to begin analyzing height/weight comparisons by political party and include voting results. It was great to see how easy it is to copy/paste – particularly once the data is prepped. In this case I used Excel a bit to do some find/replace and other tweaks to ensure that the two data sets merged successfully.
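The merge described above was done by hand with Excel and copy/paste, but the same idea can be sketched in Python under a few assumptions: that both data sets share a candidate-name column, and that names differ only in spacing and capitalization. The field names, the helper functions, and the sample rows (`Candidate A`, `Party X`, and all numbers) are placeholders invented for illustration, not real election data.

```python
def normalize_name(name):
    """Collapse extra whitespace and ignore case so names match across files."""
    return " ".join(name.split()).casefold()


def merge_on_candidate(election_rows, physical_rows):
    """Attach height and weight to each election row, matching on candidate name.

    Election rows with no match are kept, with height and weight set to None.
    """
    by_name = {normalize_name(r["candidate"]): r for r in physical_rows}
    merged = []
    for row in election_rows:
        extra = by_name.get(normalize_name(row["candidate"]), {})
        combined = dict(row)  # copy so the input rows are left untouched
        combined["height_in"] = extra.get("height_in")
        combined["weight_lb"] = extra.get("weight_lb")
        merged.append(combined)
    return merged


# Placeholder rows only; none of these values come from the real data sets.
election_rows = [
    {"candidate": "Candidate A", "party": "Party X", "electoral_vote": 10},
    {"candidate": "Candidate B", "party": "Party Y", "electoral_vote": 5},
]
physical_rows = [
    {"candidate": "  candidate   a ", "height_in": 70, "weight_lb": None},
]

for row in merge_on_candidate(election_rows, physical_rows):
    print(row)
```

Normalizing the key before matching plays the same role as the find/replace pass in Excel: most of the effort in a merge goes into making the two files agree on how the shared field is written.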

As I reflect on my learning process I see that, while I was able to find some data and get started with the program, for me the enjoyment of learning came from interacting with others in the process. When I wasn’t working with Donna (thanks for the great discussions, Donna!), I was often pointing out interesting facts from the data to my wife. Reflecting with others really enhances my learning process.

Socially constructed learning is clearly the choice for me, as I find that I gain energy when I have a chance to engage with others in a discussion about a topic. Synchronous discussion is something I still prefer, but I also enjoy asynchronous opportunities that raise questions to be answered and offer suggestions for other steps to try. Both of these methods have been prevalent in this learning adventure.

From Dewey, “The principle that development of experience comes about through interaction means that education is essentially a social process.” (58)

It is also clear to me that I enjoy the hands-on aspect of the learning adventures. If I were simply reading about the Presidential Election statistics, I am sure I would not be as engaged as I am when analyzing the data on my own using InspireData. This tool allows me to see dynamically how the data changes as fields, formulas, and other constraints are added, removed, or adjusted. Ultimately it allowed me to do something of personal interest with the data, which, as Papert points out in describing how Brian and Henry engaged with Logo, adds a sense of power to the learning:

“Beyond developing technical mathematical skills, they came to experience mathematics in a very different way. It became something to be used purposefully; they felt it as a source of power in pursuing important and deeply personal projects. I am not sure that people who have not experienced mathematics in this way can fully appreciate how heady, how powerful, it can be.” (47)

Reflection posted in the discussion forum of Part Three of the LA:

Data Dabbling…

The final data set that I posted on the Wiki tonight is fairly complete and flexible. It would allow users to answer many questions from all of the past Presidential Elections.

I have updated and further ‘cleaned’ the combined data file I made from bringing Anne’s Presidential height/weight comparison data into my data set containing voting results and other election information, such as Vice Presidential candidates and political parties.

I was able to locate almost all of the Presidential candidate heights back to 1789, allowing me to expand the data set from my original combined file. I also found a few additional weights, but not very many – although my search was not necessarily exhaustive.

I added the following notes to the file to help explain the data and add some key information:

Presidential heights are represented in inches, weight is in pounds.
In 1800 a “House Vote” was required to decide the election which went in favor of Thomas Jefferson over Aaron Burr by a vote of 10 to 4.
In 1824 a “House Vote” was required to decide the election which went in favor of John Quincy Adams over Andrew Jackson by a vote of 13 to 7.
Prior to 1896 very few Presidential Candidate weights are available; some heights are also not included.

References:

The New York Times (online), “The Measure of a President” October 6, 2008: http://www.nytimes.com/interactive/2008/10/06/opinion/06opchart.html
Dave Leip’s “Atlas of U.S. Presidential Elections”: http://uselectionatlas.org/
Wikipedia: http://en.wikipedia.org/wiki/Heights_of_United_States_Presidents_and_presidential_candidates

I am sure this data file could be further embellished. My original file had many of the Presidential Candidates’ home states included as well as the home states of the Vice Presidential candidates. I removed this information as the source I was using did not include it for many of the elections. If someone had the time to do more searching and updating of the file this would be a nice addition.

Another thing that could be added to the file would be some pre-made charts and graphs. I made charts and graphs with other versions of the file to test the process and learn from my data, but left them off of this file so that others could start from scratch. I could see including two or three pre-made ones based on some of the top information included. That might spark interest in and show the value of the data.

Self-Assessment reflection posted in the discussion forum:

I think the combined database that I put together would be of value to school children. They can use the information that Anne collected to compare the heights and weights of a large number of the Presidential candidates (prior to 1896 weight information is not readily available). They may also use the details I included to delve into the specifics related to the popular and electoral votes and political parties for the top two candidates involved in each election since 1789.

The original full database that I created may also be of value to others, although I would like to spend more time cleaning that one should I be asked to submit it for publishing. This would include locating a little more information, adding notes, and perhaps removing two fields.

I am happy with what I was able to accomplish with this learning adventure. It would be great to see others enjoy using the data.

  Content © Daniel J. Wood