**
POLS 6386 MEASUREMENT THEORY
Eleventh Assignment
Due 15 April 2003**

- In this problem we are going to use the Multidimensional Scaling
routine that is part of the
**pcurve** package in **R** (a bow to Noah Kaplan for pointing this out to me!). Download the example **R** program and data files below that show how it works with the **Color Circle** data:

R Program to do Kruskal MDS (mdscal_color_1.r)

A listing of **mdscal_color_1.r** is below:

    #
    #
    # mdscal_color_1.r -- Tests R Version of Kruskal MDS
    #
    # This needs MASS and pcurve packages
    #
    # The Data Must Be Transformed to Squared Distances Below
    #
    library(MASS)
    library(pcurve)
    #
    T <- matrix(scan("D:/R_Files/colors.txt",0),ncol=14,byrow=TRUE)
    # header=TRUE is spelled out because T now holds the data matrix
    colornames <- read.csv("D:/R_Files/color_coords.txt",header=TRUE,row.names=1)
    attach(colornames)
    TT <- T
    nrow <- length(T[,1])
    ncol <- length(T[1,])
    xrow <- NULL
    xcol <- NULL
    matrixmean <- 0
    #
    # Transform the Matrix into Squared Distances. The R MDS program assumes that
    # the data are distances
    i <- 0
    while (i < nrow) {
      i <- i + 1
      j <- 0
      while (j < ncol) {
         j <- j + 1
         TT[i,j] <- ((100 - T[i,j])/50)**2
      }
    }
    T <- TT
    #
    # Kruskal MDS Routine
    #
    # T -- Input
    # dim=2 -- number of dimensions
    # maxit = number of iterations
    # The program returns the configuration in points and the Stress in stress:
    # for example, colorsmds$points and colorsmds$stress
    #
    colorsmds <- isomds(T,dim=2, maxit=50)
    #
    #
    plot(colorsmds$points[,1],colorsmds$points[,2],type="n",asp=1,
         main="The Color Circle\nFrom MDS Program in R",
         xlab="First Dimension",ylab="Second Dimension",
         xlim=c(-3.0,3.0),ylim=c(-3.0,3.0))
    text(colorsmds$points[,1],colorsmds$points[,2],labels=row.names(colornames),adj=0)
    text(-2.0,2.5,paste("Stress = ", 0.01*round(colorsmds$stress, 2)),col="blue")
    # Note this neat trick -- Stress is returned as a percentage so that
    # multiplying it by 0.01 converts it to the KYST style. The "round(..)" command
    # gets us 4 digits after the decimal point

Below is the graph that the program generates:

- Repeat problem 1a of Homework 4 using
**mdscal_color_1.r**. Namely, run Tables 1.1 (crime data), 1.3 (nations data), 1.4 (one spoked wheels), 4.1 (color data), and 4.4 (Faces data -- note that these are *distances*!) in one, two, and three dimensions. Produce two-dimensional plots for each dataset and make a table showing the Stress in one, two, and three dimensions for **isomds(...)** *and* **KYST** for all the matrices. Notes:

- You have already run
**KYST** on all the above matrices except 4.4.

- You will have to use
**Epsilon** to change Tables 1.1, 1.3, 1.4, and 4.4 into *square symmetric* matrices!

- To run
**mdscal_color_1.r** in one dimension, simply comment out the plotting commands!

- If the input data are distances/dissimilarities, simply comment out the
transformation line.
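The transformation step can also be written without explicit loops. The sketch below is a vectorized equivalent of the line `TT[i,j] <- ((100 - T[i,j])/50)**2`, assuming a square similarity matrix whose maximum (diagonal) value is 100, as in the color data. The helper name `sim_to_sqdist` and the toy matrix are hypothetical and are not part of **mdscal_color_1.r**.

```r
# Hypothetical helper: convert similarities to squared distances,
# mirroring TT[i,j] <- ((100 - T[i,j])/50)**2 from mdscal_color_1.r.
sim_to_sqdist <- function(S, max_sim = 100, scale = 50) {
  ((max_sim - S) / scale)^2
}

# Toy 3 x 3 similarity matrix (made up for illustration only)
S <- matrix(c(100,  80,  20,
               80, 100,  50,
               20,  50, 100), ncol = 3, byrow = TRUE)

D2 <- sim_to_sqdist(S)
print(D2)

# Sanity checks: the diagonal should be all zeroes and the matrix symmetric
stopifnot(all(diag(D2) == 0), isSymmetric(D2))
```

For input that is already a distance or dissimilarity matrix (such as the Faces data), the call to the helper would simply be skipped, which corresponds to commenting out the transformation line.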

- Make a Shepard Gradient of Generalization graph for each
two-dimensional configuration from part (a). Use the
**R** program below to do the plots:

R Program to do Shepard Gradient of Generalization graph (mdscal_color_3.r)

Below is a listing of the program:

    #
    # mdscal_color_3.r -- Tests R Version of Kruskal MDS -- This version
    #    Produces Shepard's Gradient of Generalization
    #
    # This needs MASS and pcurve packages
    #
    # The Data Must Be Transformed to Squared Distances Below
    #
    # Note that this program is the same as the one above
    # through the call to isomds(...)
    #
    library(MASS)
    library(pcurve)
    #
    T <- matrix(scan("F:/R_Files_Office/colors.txt",0),ncol=14,byrow=TRUE)
    # header=TRUE is spelled out because T now holds the data matrix
    colornames <- read.csv("F:/R_Files_Office/color_coords.txt",header=TRUE,row.names=1)
    attach(colornames)
    TT <- T
    #
    # Save Copy of Original Data
    #
    TTT <- T
    nrow <- length(T[,1])
    ncol <- length(T[1,])
    #
    # This simply creates a matrix that will hold
    # the "DATA" and "DIST" as in KYST
    #
    y <- rep(0,((nrow*(nrow+1))/2)*5)
    dim(y) <- c(((nrow*(nrow+1))/2),5)
    #
    #
    # Transform the Matrix to Squared Distances
    #
    i <- 0
    while (i < nrow) {
      i <- i + 1
      j <- 0
      while (j < ncol) {
         j <- j + 1
    #
    # Adjust This Transformation According to the MAXIMUM similarity
    #  Value! The Diagonal of the Matrix Should be all zeroes!
    #  If the data are correlations this would be:
    #     TT[i,j] <- (1.0 - T[i,j])**2
    #  If the maximum value for the Similarity was 10 this would be:
    #     TT[i,j] <- (10.0 - T[i,j])**2   or
    #     TT[i,j] <- ((10.0 - T[i,j])/5.0)**2
    #
         TT[i,j] <- ((100 - T[i,j])/50)**2
      }
    }
    T <- TT
    #
    #
    # T -- Input
    # dim=2 -- number of dimensions
    #
    dim <- 2
    colorsmds <- isomds(T,dim, maxit=50)
    #
    #
    # Create Data for Gradient of Generalization Plots
    #
    # This code creates the Euclidean distances (summed squared coordinate
    # differences) between the points estimated by isomds(...).
    # These are stored in the matrix y(,) along with the original
    # Similarities and the Dissimilarities computed above.
    #
    i <- 1
    kk <- 0
    while (i <= nrow) {
      j <- i
      while (j <= nrow) {
         k <- 0
         dist <- 0.0
         while (k < dim) {
            k <- k+1
            dist <- dist + (colorsmds$points[i,k]-colorsmds$points[j,k])^2
         }
         kk <- kk + 1
         y[kk,1] <- dist
         y[kk,2] <- T[i,j]
         y[kk,3] <- TTT[i,j]
         y[kk,4] <- i
         y[kk,5] <- j
         j <- j + 1
      }
      i <- i + 1
    }
    # This produces a plot of the Distances against the Dissimilarities
    plot(y[,1],y[,2],
         xlab="Psychological Distance",ylab="Observed Distance/Dissimilarity",col="blue")
    mtext(side=3,line=1.5,"Shepard's Theory of Generalization\nColor Data: Dissimilarities",font=2)
    lines(lowess(y[,1],y[,2],f=.2),lwd=3)
    text(10,2,"Line estimated \nUsing Lowess")
    #
    # This Command allows us to create the second plot
    # (windows() exists only on Windows; dev.new() is the portable equivalent)
    # The Last One Drawn will be on Top
    #
    windows()
    #
    # This produces a plot of the Distances against the Similarities
    plot(y[,1],y[,3],ylim=c(0,100),
         xlab="Psychological Distance",ylab="Observed Similarity",col="red")
    mtext(side=3,line=1.5,"Shepard's Theory of Generalization\nColor Data: Similarities",font=2)
    lines(lowess(y[,1],y[,3],f=.2),lwd=3)
    text(10,80,"Line estimated \nUsing Lowess")

For the color data, you should get the following two graphs:

Turn in both graphs for each matrix except for the Faces data (Table 4.4). Just turn in the Dissimilarities against the Distances plot for the Faces data (you can comment out the code for the other plot for this data).
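The nested `while` loops that fill `y` can be condensed using base R. The following is a minimal sketch under stated assumptions: `pts` stands in for `colorsmds$points` and `D2` for the transformed matrix `T`. Both hold toy values here, not output from the MDS routine, and the name `pair_tab` is hypothetical.

```r
# Toy configuration (stand-in for colorsmds$points) and toy squared
# dissimilarities (stand-in for the transformed matrix T).
pts <- matrix(c(0, 0,
                3, 4,
                6, 8), ncol = 2, byrow = TRUE)
D2  <- matrix(c(0, 1, 4,
                1, 0, 1,
                4, 1, 0), ncol = 3, byrow = TRUE)

# Squared Euclidean distances between all pairs of points, matching the
# sum of squared coordinate differences computed in mdscal_color_3.r
fitted2 <- as.matrix(dist(pts))^2

# Keep each pair once (i <= j, diagonal included), as the original loops do.
# The rows come out in column-major order rather than the loop order,
# which does not matter for a scatterplot.
keep <- upper.tri(fitted2, diag = TRUE)
pair_tab <- data.frame(dist   = fitted2[keep],
                       dissim = D2[keep],
                       i      = row(fitted2)[keep],
                       j      = col(fitted2)[keep])
print(pair_tab)
```

With `n` stimuli the table has `n*(n+1)/2` rows, the same count the original program reserves for `y`. Its first two columns can be passed to `plot(...)` and `lowess(...)` in the same way as `y[,1]` and `y[,2]`.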

- In this problem we are going to revisit some of the analyses we did in
Problems 2.c and 2.d of Homework 9. Use
**Epsilon** to change the number of dimensions from "2" to "1" in **PERFSTRT.2000** and change the last number in the second line from "1" to "2" (this number tells the program that respondent number 2 is on the left or liberal side of the dimension -- it ensures that the rank ordering goes from liberal to conservative), then rename it **PERFSTRT.DAT** (do not overwrite any other **PERFSTRT.DAT** you may have). Your **PERFSTRT.DAT** file should look like this:

      VOTE_THERM_2000.ORD
      NON-PARAMETRIC MULTIDIMENSIONAL UNFOLDING OF THERMOMETER DATA
      1 91 20 40 2
      (40A1,3900I1)
      (I5,1X,40A1,2I5,50F8.3)

We are going to use a modified version of **Optimal Classification** more suited to analyzing roll calls from rank order data. The program is **PERFLRANK**. Download the program and place it in the same directory as your **PERFSTRT.DAT** and the candidate "roll call" file **VOTE_THERM_2000.ORD**.

Maximum Classification Scaling Program (PERFLRANK)

- Run
**PERFLRANK** using the above **PERFSTRT.DAT** and **VOTE_THERM_2000.ORD**. Turn in the **PERF21.DAT** file.

- The
**PERF25.DAT** file shows the rank ordering of the 1,443 respondents included in the scaling along with the rank positions of the cutpoints for the candidate "roll calls." Note that because *tied ranks are allowed*, the respondents are ordered from 17.5 to 1,419.5. The respondents are first shown in their rank ordering from 17.5 to 1,419.5 and then in the order that they appear in **VOTE_THERM_2000.ORD**. The 91 cutpoints corresponding to the candidate "roll calls" are at the bottom of **PERF25.DAT** and the rank positions of the cutpoints are in the last column. This part of the file looks like this:

       1  1  436  568  291  0.333  1 6   392.500
       2  2  691  609   80  0.869  1 6   809.000
       3  3  781  369   99  0.732  1 6  1114.000
       4  4  515  453  191  0.578  1 6  1058.750
       5  5  503  539  166  0.670  1 6   688.000
       6  6  494  462  178  0.615  1 6   922.500
       7  7  428  463  212  0.505  1 6   797.750
       8  8  558  488   84  0.828  1 6   809.000
       9  9  588  427  328  0.232  6 1   171.000
      10 10  422  604  340  0.194  1 6   779.250
      11 11  720  567   65  0.885  1 6   788.250
      12 12  662  313   95  0.696  1 6  1047.250
      13 13  693  504  151  0.700  1 6   797.750
      etc etc etc
      85 85  633  573  169  0.705  1 6   715.750
      86 86  678  488   60  0.877  1 6   896.750
      87 87  707  235   91  0.613  1 6  1199.000
      88 88  701  387  110  0.716  1 6   980.750
      89 89  644  265  117  0.558  6 1   359.000
      90 90  565  468  147  0.686  6 1   602.000
      91 91  249  600  208  0.165  6 1  1349.500

Use **Epsilon** to merge the 91 pairs of candidate names that you created in question 1.a of Homework number 10 into this file. The first 13 lines of your file *should look exactly like this*!!!

      CLINTON GORE        1  436  568  291  0.333  1 6   392.500
      CLINTON BUSH        2  691  609   80  0.869  1 6   809.000
      CLINTON BUCHANAN    3  781  369   99  0.732  1 6  1114.000
      CLINTON NADER       4  515  453  191  0.578  1 6  1058.750
      CLINTON MCCAIN      5  503  539  166  0.670  1 6   688.000
      CLINTON BRADLEY     6  494  462  178  0.615  1 6   922.500
      CLINTON LIEBERMAN   7  428  463  212  0.505  1 6   797.750
      CLINTON CHENEY      8  558  488   84  0.828  1 6   809.000
      CLINTON HILLARY     9  588  427  328  0.232  6 1   171.000
      CLINTON DEMPARTY   10  422  604  340  0.194  1 6   779.250
      CLINTON REPUBPARTY 11  720  567   65  0.885  1 6   788.250
      CLINTON REFORMPTY  12  662  313   95  0.696  1 6  1047.250
      CLINTON PARTIES    13  693  504  151  0.700  1 6   797.750
      etc etc etc

Turn in the **Epsilon** macro you used to create your file and a complete listing of the file.

- There are 13 cutpoints associated with former President
Clinton. The cutpoint between former President Clinton and former Vice-President
Gore is 392.5. The "1 6" to the immediate left of the rank tells you that
Yea (vote for Clinton) was *below*
392.5 and Nay (vote for Gore) was *above* 392.5. This implies that former President Clinton's rank position is *below* 392.5 and former Vice-President Gore's rank position is *above* 392.5. Although the data have some noise in them, you should be able to pin down a range within which former President Clinton's rank should lie. For example, note that the cutpoint between former President Clinton and Hillary Clinton is 171.0 and the "6 1" means that Hillary Clinton is *below* 171.0 and former President Clinton is *above* 171.0.

Using the above reasoning, locate a range of ranks for former President Clinton, former Vice-President Gore, President Bush, and Hillary Clinton. Defend your reasoning for each person!
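The bounding argument can be sketched in **R** for the 13 Clinton cutpoints shown in the listing earlier in this problem. This is only an illustration under two assumptions: that Clinton is the "Yea" alternative in all 13 pairs, and that every cutpoint is taken at face value (noise is ignored). The data frame and column names are hypothetical.

```r
# The 13 Clinton cutpoints copied from the listing in this problem.
# yea/nay are the two codes printed to the left of the cutpoint.
clinton <- data.frame(
  other = c("GORE", "BUSH", "BUCHANAN", "NADER", "MCCAIN", "BRADLEY",
            "LIEBERMAN", "CHENEY", "HILLARY", "DEMPARTY", "REPUBPARTY",
            "REFORMPTY", "PARTIES"),
  yea = c(1, 1, 1, 1, 1, 1, 1, 1, 6, 1, 1, 1, 1),
  nay = c(6, 6, 6, 6, 6, 6, 6, 6, 1, 6, 6, 6, 6),
  cut = c(392.5, 809, 1114, 1058.75, 688, 922.5, 797.75, 809,
          171, 779.25, 788.25, 1047.25, 797.75)
)

# Assumption: Clinton is the "Yea" alternative in every row.
# "1 6" -> Yea lies below the cutpoint, so the cutpoint is an upper bound.
# "6 1" -> Yea lies above the cutpoint, so the cutpoint is a lower bound.
upper <- clinton$cut[clinton$yea == 1]
lower <- clinton$cut[clinton$yea == 6]

# Tightest bounds: largest lower bound and smallest upper bound
clinton_range <- c(lower = max(lower), upper = min(upper))
print(clinton_range)
```

For these rows the result is 171.0 to 392.5, the same interval the Hillary Clinton and Gore cutpoints give in the text. For a candidate whose cutpoints are noisier, the largest lower bound can exceed the smallest upper bound, and then the range has to be argued from the bulk of the cutpoints rather than from the extremes.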

- Use
**Epsilon** to merge the "vote for" variable in **VOTE_THERM_2000.ORD** into the respondent rank file (cut the respondents out of **PERF25.DAT** first). Turn in a listing of the **Epsilon** macro you use to create the file.

- Make smoothed histograms of the Bush and Gore voters from the above file. These histograms should look similar to those you made in question 1.c of Homework 8.
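One way to draw smoothed histograms in **R** is with kernel density estimates. The sketch below uses simulated ranks for hypothetical respondents, not the actual merged file, and it may differ from the method used in Homework 8. The data frame `voters` and its columns are assumptions made for illustration.

```r
# Simulated stand-in data: rank positions and vote choice for 200
# hypothetical respondents. Replace with the merged respondent rank file.
set.seed(1)
voters <- data.frame(
  rank = c(rnorm(100, mean = 500, sd = 200),
           rnorm(100, mean = 950, sd = 200)),
  vote = rep(c("Gore", "Bush"), each = 100)
)

# Kernel density estimates serve as the smoothed histograms
gore_dens <- density(voters$rank[voters$vote == "Gore"])
bush_dens <- density(voters$rank[voters$vote == "Bush"])

# Draw both curves on common axes so they can be compared
plot(gore_dens, col = "blue", lwd = 2,
     xlim = range(gore_dens$x, bush_dens$x),
     ylim = c(0, max(gore_dens$y, bush_dens$y)),
     main = "Smoothed Histograms of Rank Positions",
     xlab = "Rank Position (Liberal to Conservative)", ylab = "Density")
lines(bush_dens, col = "red", lwd = 2)
legend("topright", legend = c("Gore voters", "Bush voters"),
       col = c("blue", "red"), lwd = 2)
```

The amount of smoothing is controlled by the `bw` and `adjust` arguments of `density(...)`.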
