**
POLI 279 MEASUREMENT THEORY
Seventh Assignment
Due 9 June 2006**

- In this problem we are going to use the Parametric Bootstrap version of **W-NOMINATE** to generate standard errors for the 104^{th} Senate. Download the program, control card file, and data file and place them in the same directory.

WNOMJLEWIS -- W-NOMINATE Program

NOMSTART_JLEWIS.DAT -- W-NOMINATE Bootstrap Control Card File for 104^{th} Senate

SEN104KH.ORD -- 104^{th} Senate roll call data

**W-NOMINATE** is discussed in detail with several examples on the W-NOMINATE Page and the bootstrap file output is discussed in detail on the Parametric Bootstrap page.

**NOMSTART_JLEWIS.DAT** looks like this:

    SEN104KH.ORD
    NOMINAL MULTIDIMENSIONAL UNFOLDING OF 104TH SENATE
    919 1 22
    2 36 11   Number Dimensions, Number Characters to Read from Header, Number of Bootstrap Trials
    15.0000 0.5000 0.0250 20
    (36A1,15000I1)
    (1x,I4,36A1,1X,4i4,51f7.3)
    (I4,1X,36A1,80F10.4)

It is identical to the **NOMSTART.DAT** used for question 5.e-g of Homework 6 except for the "11" in the fourth line of the file. This is the number of bootstrap trials. Normally it is set to 1001 but we will use 101 in this problem.

Put the files in the same directory and run **WNOMJLEWIS**. You will get 7 output files -- **fort.26**, **fort.56**, **NOM21.DAT**, **NOM23.DAT**, **NOM31.DAT**, **NOM33.DAT**, and **NOM36.DAT**. The **NOM21.DAT - NOM36.DAT** files are explained on the W-NOMINATE Page.

**FORT.26** contains the parametric bootstrapped legislator coordinates. The file will look something like this:

    1 1049990999 0USA 10000CLINTON -0.9160 -0.3336 -0.8668 -0.2853 0.1039 0.2095 1.0000 0.1436 0.1436 1.0000
    2 1041470541 0ALABAMA 10000HEFLIN -0.3923 -0.2917 -0.3546 -0.4203 0.0510 0.1876 1.0000 0.7794 0.7794 1.0000
    3 1049465941 0ALABAMA 20000SHELBY 0.5858 -0.3131 0.6871 -0.3176 0.1162 0.0603 1.0000 0.2238 0.2238 1.0000
    4 1041490781 0ALASKA 20000MURKOWSKI 0.6600 -0.1009 0.7623 -0.0338 0.1138 0.1090 1.0000 0.0790 0.0790 1.0000
    5 1041210981 0ALASKA 20000STEVENS 0.4199 -0.3162 0.5695 -0.3110 0.1645 0.1222 1.0000 0.1487 0.1487 1.0000
    etc etc etc
    98 1044930873 0WASHING 10000MURRAY -0.9235 -0.1667 -0.8936 -0.1832 0.0555 0.0916 1.0000 -0.1436 -0.1436 1.0000
    99 104 136656 0WEST VI 10000BYRD, ROBER -0.6504 -0.5292 -0.6249 -0.6165 0.0465 0.1304 1.0000 0.1703 0.1703 1.0000
    100 1041492256 0WEST VI 10000ROCKEFELLER -0.8080 -0.0714 -0.7819 -0.0967 0.0416 0.1585 1.0000 0.6202 0.6202 1.0000
    101 1044930925 0WISCONS 10000FEINGOLD -0.8300 0.5577 -0.7844 0.6024 0.0546 0.0564 1.0000 0.6076 0.6076 1.0000
    102 1041570325 0WISCONS 10000KOHL -0.6772 0.5656 -0.6449 0.7452 0.0395 0.1934 1.0000 0.5894 0.5894 1.0000
    103 1041471068 0WYOMING 20000SIMPSON 0.3605 -0.4309 0.5124 -0.4402 0.1725 0.1322 1.0000 0.4797 0.4797 1.0000
    104 1041563368 0WYOMING 20000THOMAS 0.7165 0.3480 0.7850 0.3535 0.0824 0.1281 1.0000 0.5629 0.5629 1.0000

The legislator coordinates are the first two columns after the name of the legislator. For example, former President Clinton's coordinates are -0.9160 and -0.3336.

- Run 101 bootstrap trials using **WNOMJLEWIS** (change **011** to **101** in the **NOMSTART_JLEWIS.DAT** file and run the program). E-Mail me the **NOM21.DAT** file from the run.

- Use **R** to plot the legislators in two dimensions from the **FORT.26** file. Use "D" for Non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" for President Clinton. This graph should be in the same format as the one you did for question 2.f of Homework 5.

- The fifth and sixth columns of numbers after the legislator's name are the bootstrapped standard errors for the first and second dimensions, respectively (0.1039 and 0.2095 for Clinton). The last four columns are the Pearson correlations between the bootstrapped first and second dimension coordinates computed across the bootstrap trials. Because we are only working with two dimensions, just the correlation between the first and second dimension estimates is relevant (0.1436 for Clinton).

- We are going to make a graph like the ones shown in Figures 4.4 and 4.5 of my book. Download the following **R** program:

Plot_Bootstrap_New.r -- R Program to Plot Parametric Bootstrap Output, FORT.26

Here is what **Plot_Bootstrap_New.r** looks like:

    #
    # Plot_Bootstrap_New.r -- Program reads parametric bootstrap files posted
    #                         at http://voteview.org/Lewis_and_Poole.htm
    #                         and plots the legislator ideal points and the
    #                         standard errors
    #
    # Remove all objects just to be safe
    #
    rm(list=ls(all=TRUE))
    #
    library(MASS)
    library(stats)
    library(ellipse)   # You will need to download and install this library
    #
    # Set up to read Parametric Bootstrap File
    #
    rc.file <- "c:/ucsd_homework_7/fort.26"
    #
    # The variable fields and their widths
    #
    rc.fields <- c("counter","cong","id","state","dist","lstate","party",
                   "eh1","eh2","name","wnom1","wnom2","wnom1bs","wnom2bs",
                   "se1","se2","r11","r12","r21","r22")
    #
    # Note -- For some files the field widths will be (e.g., H108_BS_1000_2.DAT):
    #         (5,3,5,2,2,7,4,1,1,11,8,7,7,7,7,7,7)
    #
    rc.fieldWidths <- c(4,4,5,2,2,7,4,1,1,11,10,10,10,10,10,10,10,10,10,10)
    #
    # Read the vote data from fwf (FIXED WIDTH FORMAT -- FWF)
    #
    TT <- read.fwf(file=rc.file,widths=rc.fieldWidths,as.is=TRUE,col.names=rc.fields)
    party <- TT[,7]
    state <- TT[,4]
    wnom1 <- TT[,11]
    wnom2 <- TT[,12]
    std1 <- TT[,15]
    std2 <- TT[,16]
    corr12 <- TT[,18]
    #
    nrow <- length(TT[,1])
    ncol <- length(TT[1,])
    #
    plot(TT[,11],TT[,12],type="n",asp=1,
         main="",
         xlab="",
         ylab="",
         xlim=c(-1.0,1.0),ylim=c(-1.0,1.0),font=2)
    points(wnom1[party == 100 & state >= 40 & state <= 51],wnom2[party == 100 & state >= 40 & state <= 51],pch='S',col="red")
    points(wnom1[party == 100 & state == 53],wnom2[party == 100 & state == 53],pch='S',col="red")
    points(wnom1[party == 100 & state == 54],wnom2[party == 100 & state == 54],pch='S',col="red")
    points(wnom1[party == 100 & (state < 40 | state > 54)],wnom2[party == 100 & (state < 40 | state > 54)],pch='D',col="red")
    points(wnom1[party == 100 & state == 52],wnom2[party == 100 & state == 52],pch='D',col="red")
    points(wnom1[party == 200],wnom2[party == 200],pch='R',col="blue")
    # Main title
    mtext("104th Senate From W-NOMINATE\nWith Bootstrapped Standard Errors",side=3,line=1.50,cex=1.2,font=2)
    # x-axis title
    mtext("Liberal - Conservative",side=1,line=2.75,cex=1.2)
    # y-axis title
    mtext("Social/Lifestyle Issues",side=2,line=2.5,cex=1.2)
    #
    #
    #  This code does the cross-hairs for the standard errors.  If the
    #    correlation is greater than .15 between the two dimensions,
    #    the 95% confidence ellipse is shown
    #
    for (i in 1:nrow) {
    #
    #  These two statements do the cross-hairs
    #
      lines(c(wnom1[i],wnom1[i]),c(wnom2[i]-1.96*std2[i],wnom2[i]+1.96*std2[i]),col="gray")
      lines(c(wnom1[i]-1.96*std1[i],wnom1[i]+1.96*std1[i]),c(wnom2[i],wnom2[i]),col="gray")
    #
    #  This if statement does the ellipse
    #
      if (abs(corr12[i]) > .15){
        lines(ellipse(x=corr12[i],scale=c(std1[i],std2[i]),
                      centre=c(wnom1[i],wnom2[i])),
              col="gray")
      }
    }
    #

Run this program and turn in the plot.
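The standard errors and correlations stored in FORT.26 summarize how each legislator's estimated coordinates vary across the bootstrap trials. The sketch below illustrates the idea for a single hypothetical legislator using simulated draws. The variable names, the simulated values, and the use of the plain sample standard deviation are assumptions made for illustration; WNOMJLEWIS may compute its deviations differently.

```r
# Illustrative sketch only: simulated bootstrap draws for ONE hypothetical
# legislator (these are not output from WNOMJLEWIS).
set.seed(1)
n_trials <- 101

# Build two correlated columns of draws (first and second dimension).
z1 <- rnorm(n_trials)
z2 <- rnorm(n_trials)
draws <- cbind(
  dim1 = -0.9 + 0.10 * z1,
  dim2 = -0.3 + 0.20 * (0.5 * z1 + sqrt(1 - 0.5^2) * z2)
)

# Bootstrapped standard errors: spread of each coordinate across trials
# (assumption: plain sample standard deviation).
se <- apply(draws, 2, sd)

# 2 x 2 Pearson correlation matrix of the draws across trials.
R <- cor(draws)

# Flatten row by row into the order r11, r12, r21, r22.
flat <- as.vector(t(R))

print(round(se, 4))
print(round(flat, 4))
```

The flattened vector has the same layout as the last four numbers of a FORT.26 record: the first and last entries are always 1, and the two middle entries are equal, which is why only one correlation carries information in two dimensions.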

**FORT.56** contains the average **Optimal Classification** coordinates generated by running the Optimal Classification Program on every bootstrap draw. The file will look something like this:

    1 1049990999 0USA 10000CLINTON -0.3160 -0.7979 -0.8467 -0.2906 0.5711 0.5775 0.1089 0.2070 1.0000 -0.8322 -0.8322 1.0000
    2 1041470541 0ALABAMA 10000HEFLIN -0.2219 0.0278 0.0011 -0.2066 0.3139 0.3403 0.1975 0.2220 1.0000 -0.9512 -0.9512 1.0000
    3 1049465941 0ALABAMA 20000SHELBY 0.6878 -0.0150 0.7162 -0.1751 0.0633 0.1796 0.0529 0.0582 1.0000 -0.3841 -0.3841 1.0000
    4 1041490781 0ALASKA 20000MURKOWSKI 0.7285 -0.0996 0.7283 -0.0236 0.0781 0.0929 0.0741 0.0445 1.0000 -0.5790 -0.5790 1.0000
    5 1041210981 0ALASKA 20000STEVENS 0.6308 -0.1836 0.6209 -0.2080 0.0682 0.0844 0.0639 0.0763 1.0000 -0.4706 -0.4706 1.0000
    etc etc etc
    98 1044930873 0WASHING 10000MURRAY -0.8772 -0.0015 -0.9068 -0.1126 0.0506 0.1520 0.0378 0.0919 1.0000 0.0408 0.0408 1.0000
    99 104 136656 0WEST VI 10000BYRD, ROBER -0.6919 -0.5972 -0.6128 -0.3697 0.0959 0.2762 0.0448 0.1300 1.0000 -0.7018 -0.7018 1.0000
    100 1041492256 0WEST VI 10000ROCKEFELLER -0.7673 -0.0990 -0.7713 -0.1318 0.0585 0.0952 0.0554 0.0842 1.0000 0.2288 0.2288 1.0000
    101 1044930925 0WISCONS 10000FEINGOLD -0.9161 0.3985 -0.8439 0.3498 0.0901 0.1056 0.0457 0.0875 1.0000 0.7287 0.7287 1.0000
    102 1041570325 0WISCONS 10000KOHL -0.6915 0.2468 -0.7519 0.4653 0.1107 0.2343 0.0859 0.0404 1.0000 -0.2221 -0.2221 1.0000
    103 1041471068 0WYOMING 20000SIMPSON 0.5253 -0.1192 0.5113 -0.0559 0.0463 0.1029 0.0417 0.0743 1.0000 -0.3008 -0.3008 1.0000
    104 1041563368 0WYOMING 20000THOMAS 0.7193 0.1231 0.7215 0.1730 0.0458 0.1302 0.0434 0.1130 1.0000 -0.4295 -0.4295 1.0000

The average legislator coordinates are the third and fourth columns after the name of the legislator. For example, former President Clinton's coordinates are -0.8467 and -0.2906.

Use **R** to plot the legislators in two dimensions from the **FORT.56** file. Use "D" for Non-Southern Democrats, "S" for Southern Democrats, "R" for Republicans, and "P" for President Clinton. This graph should be in the same format as the one you did for question 2.f of Homework 5.
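One way to produce the "D", "S", "R", and "P" tokens is to map the party and state codes to a label before plotting. The sketch below is a minimal illustration, not a required solution: `party_token` is a hypothetical helper, the definition of Southern states (codes 40 to 51, 53, and 54) is copied from Plot_Bootstrap_New.r, and treating state code 99 as the President is an assumption read off the Clinton record. The five data points are the first five records of the FORT.56 listing.

```r
# Hypothetical helper: map party and state codes to plotting tokens.
party_token <- function(party, state) {
  # Southern states as defined in Plot_Bootstrap_New.r
  south <- (state >= 40 & state <= 51) | state == 53 | state == 54
  ifelse(state == 99, "P",                 # assumption: 99 marks the President
    ifelse(party == 200, "R",
      ifelse(party == 100 & south, "S",
        ifelse(party == 100, "D", "?"))))  # "?" flags any other party code
}

# First five records of the FORT.56 listing
# (Clinton, Heflin, Shelby, Murkowski, Stevens).
party <- c(100, 100, 200, 200, 200)
state <- c(99, 41, 41, 81, 81)
oc1   <- c(-0.8467, 0.0011, 0.7162, 0.7283, 0.6209)    # third column after name
oc2   <- c(-0.2906, -0.2066, -0.1751, -0.0236, -0.2080) # fourth column after name

tokens <- party_token(party, state)

# Empty plot frame, then draw each legislator as a text token.
plot(oc1, oc2, type = "n", asp = 1,
     xlim = c(-1, 1), ylim = c(-1, 1),
     xlab = "First Dimension", ylab = "Second Dimension")
text(oc1, oc2, labels = tokens,
     col = ifelse(tokens == "R", "blue", "red"), font = 2)
```

Using `text()` with a vector of labels avoids one `points()` call per group, at the cost of needing the token vector to be built first.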

- Write an **Epsilon** keyboard macro as a text file that combines the legislator coordinates from **FORT.26** with those from **FORT.56**. Assume that the macro begins with **FORT.26** in the top window, **FORT.56** in the second window, and the combined file in the third window (see question 1.a of Homework 5). Leave the header on each record. Turn in a listing of the macro and a *neatly formatted* listing of the file.

- Let **A** be the matrix of legislator coordinates from **FORT.26** *after subtracting off the column means*, and let **B** be the matrix of legislator coordinates from **FORT.56** *after subtracting off the column means*. Note that subtracting off the column means of both matrices centers both at the origin, (0.0, 0.0). Solve for the orthogonal Procrustes rotation matrix, **T**, for **B**. Namely, we want to minimize:

**L(T) = tr(A - BT)(A - BT)'**

The solution is:

**T = VU'** where

**A'B = ULV'**

where **ULV'** is the Singular Value Decomposition of **A'B** (see Borg and Groenen, pp. 430-432).

In **R** you can perform the decomposition with the svd command. For example:

    C <- t(A)%*%B
    svddecomp <- svd(C)

**svddecomp$u** has the matrix **U**

**svddecomp$v** has the matrix **V**

**svddecomp$d** has the diagonal of **L**

Note that you can check your work as we discussed in class by doing the following:

    D <- diag(svddecomp$d)
    U <- svddecomp$u
    V <- svddecomp$v
    ABCHECK <- U%*%D%*%t(V)
    errorcheck <- sum((C-ABCHECK)^2)

If the decomposition is correct, **errorcheck** should be zero up to rounding error.

Solve for **T** and turn in a *neatly formatted* listing. Compute the Pearson r-squares between the corresponding columns of **A** and **B** *before* and *after* rotating **B**.
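The rotation formula can be checked on synthetic data where the answer is known in advance. The sketch below builds a centered matrix **B** from random numbers, creates **A** by rotating **B** through a known angle, and then recovers the rotation with **T = VU'**. All of the data and the names `center`, `T_true`, and `T_hat` are assumptions for illustration; the actual assignment uses the coordinates from FORT.26 and FORT.56.

```r
# Illustrative check of T = VU' on synthetic data (not the FORT files).
set.seed(42)

# Subtract the column means so the configuration is centered at the origin.
center <- function(M) scale(M, center = TRUE, scale = FALSE)

B <- center(matrix(rnorm(40), ncol = 2))   # 20 hypothetical legislators

# A known rotation of 30 degrees; A is B rotated by T_true.
theta  <- pi / 6
T_true <- matrix(c(cos(theta), sin(theta),
                   -sin(theta), cos(theta)), nrow = 2)
A <- B %*% T_true

# Orthogonal Procrustes solution: A'B = ULV', T = VU'.
C <- t(A) %*% B
s <- svd(C)
T_hat <- s$v %*% t(s$u)

# Rotate B and compare with A column by column.
B_rot <- B %*% T_hat
r2_before <- diag(cor(A, B))^2
r2_after  <- diag(cor(A, B_rot))^2

print(round(T_hat, 4))
print(round(rbind(before = r2_before, after = r2_after), 4))
```

Because **A** was constructed as an exact rotation of **B**, the recovered matrix should match the known rotation and the r-squares after rotation should be essentially 1. With real data the fit will not be exact, so the before and after r-squares are the quantities of interest.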

- In this problem we are going to use Simon Jackman's
Bayesian MCMC Quadratic-Normal scaling program IDEAL.

Go to the IDEAL beta website and download **rollcall_0.3.3.zip**. (If the site is down, rollcall_0.3.3.zip is here.) Install the package using **Install Package From Local Zip File** in the **Packages** drop-down menu in **R**.

Download the **R** programs below along with the **Rdata** file for the 104^{th} Senate roll calls and put them in the same directory:

idealKeith.r -- Program to run **IDEAL** on the 104^{th} Senate.

S104.Rdata -- **R** Data file for 104^{th} Senate

**idealKeith.r** looks like this:

    #
    # idealKeith.r -- Implements Simon's IDEAL in R
    #
    rm(list=ls(all=TRUE))
    #
    library(rollcall)
    load("C:/ucsd_homework_7/s104.Rdata")
    rc <- rollcall(s104)
    csts <- constrain.legis(rc,x=list("KENNEDY, ED"=-1, "HELMS"=1),d=1)   # This sets up the constraints
    kpideal <- ideal(rc, priors=csts, startvals=csts, store.item=TRUE)
    sumkpideal <- summary(kpideal,include.beta=TRUE)
    write.table(sumkpideal$x.quant,"c:/ucsd_homework_7/tab_simon2_104a.txt")
    write.table(sumkpideal$beta.quant,"c:/ucsd_homework_7/tab_simon3_104a.txt")
    #

The **tab_simon2_104a.txt** file has the estimated legislator ideal points in a format similar to those from **MCMCPack** that you estimated for question 1 of Homework 5. The file should look something like this:

    "Posterior Mean" "2.5%" "97.5%"
    "HEFLIN" -0.260539161797188 -0.324277175440612 -0.199695576187364
    "SHELBY" 1.23019206655678 1.06776517589545 1.36349750197643
    "MURKOWSKI" 1.63087942782254 1.40849022832179 1.84561135364064
    "STEVENS" 1.04610545982508 0.896182928917575 1.17542884158099
    "KYL" 2.37619418505450 2.08958357054717 2.75349078670169
    etc etc etc
    "ROCKEFELLER" -1.10508436917944 -1.25041868040856 -0.982633707475099
    "FEINGOLD" -1.03161420524957 -1.14931288681370 -0.923294768025014
    "KOHL" -0.792982659833952 -0.884197325699718 -0.720465242777936
    "SIMPSON" 0.912732416007879 0.815117422171576 1.02451977202073
    "THOMAS" 1.76901626257382 1.58540354678476 1.96376294419763

- Write a macro similar to the one you used in question 1.a. and 1.b. of Homework 5 to make a nicely formatted file of the legislator coordinates. Turn in a copy of this macro.

- Replicate question 1.c. of Homework 5. Write an **R** program that graphs the rank ordering from **Optimal Classification** (horizontal axis) against the **IDEAL** medians (vertical axis). Label the axes appropriately and label a few of the Senators including Campbell (**D-CO**) and Campbell (**R-CO**).

- Report the correlation between the **OC** rank ordering and the **IDEAL** medians.

- Use **Epsilon** to combine the file you created in (a) with that from question 1.a. and 1.b. of Homework 5 and include in that file the 2.5% and 97.5% quantiles corresponding to the medians of both procedures. Report the Pearson correlation between the lengths of the "confidence intervals" for the two Bayesian procedures and also report the Pearson correlation between the **IDEAL** medians and **MCMCPack** medians.

**(20 Points Extra Credit)** Write an **R** program that reads the above file and produces a two dimensional plot of the legislator ideal points where the horizontal dimension coordinates are the **IDEAL** medians and the vertical dimension coordinates are the **MCMCPack** medians. Put cross-hairs through the points with lengths equal to the corresponding distance between the 2.5% and 97.5% quantiles. Plot Northern Democrats with "D" tokens, Southern Democrats with "S" tokens, and Republicans with "R" tokens. Note that you might want to divide all of the **MCMCPack** quantile values by 2 because **IDEAL** has Kennedy/Helms at -1/+1 and **MCMCPack** has Kennedy/Helms at -2/+2.
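The interval-length correlation and the cross-hair plot can be sketched on a toy example. Every number below is made up for illustration (four hypothetical legislators), and the data frame names `ideal` and `mcmc` and their columns `med`, `lo`, and `hi` are assumptions; the real values come from the combined file built with Epsilon.

```r
# Made-up posterior summaries for four hypothetical legislators.
ideal <- data.frame(med = c(-1.0, -0.3, 0.9, 1.6),
                    lo  = c(-1.2, -0.4, 0.8, 1.4),   # 2.5% quantile
                    hi  = c(-0.9, -0.2, 1.0, 1.8))   # 97.5% quantile
mcmc  <- data.frame(med = c(-2.1, -0.5, 1.7, 3.0),
                    lo  = c(-2.5, -0.7, 1.5, 2.6),
                    hi  = c(-1.8, -0.3, 2.0, 3.5))
tokens <- c("D", "S", "R", "R")                      # hypothetical labels

# Put MCMCPack on the IDEAL scale (Kennedy/Helms at -1/+1).
mcmc <- mcmc / 2

# Lengths of the 2.5% to 97.5% intervals and the two correlations.
len_ideal <- ideal$hi - ideal$lo
len_mcmc  <- mcmc$hi - mcmc$lo
r_len <- cor(len_ideal, len_mcmc)
r_med <- cor(ideal$med, mcmc$med)

# Empty frame, cross-hairs, then tokens on top.
plot(ideal$med, mcmc$med, type = "n",
     xlim = range(ideal$lo, ideal$hi), ylim = range(mcmc$lo, mcmc$hi),
     xlab = "IDEAL median", ylab = "MCMCPack median (divided by 2)")
segments(ideal$lo, mcmc$med, ideal$hi, mcmc$med, col = "gray")  # horizontal
segments(ideal$med, mcmc$lo, ideal$med, mcmc$hi, col = "gray")  # vertical
text(ideal$med, mcmc$med, labels = tokens, font = 2)

print(round(c(interval_lengths = r_len, medians = r_med), 4))
```

Dividing the MCMCPack values by 2 changes how the plot looks but not the Pearson correlations, since correlation is unaffected by rescaling either variable by a positive constant.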

- In this problem we will continue our comparison of **Optimal Classification**, **IDEAL**, and **MCMCPack** using the 102^{nd} Senate. Download these data files:

Sen102kh.ord -- **ASCII** Data file for 102^{nd} Senate (used with **PERFL.EXE**)

Sen102kh.dta -- **Stata** Data file for 102^{nd} Senate (used with **keith2.r**)

S102.Rdata -- **IDEAL** Data file for 102^{nd} Senate (used with **idealKeith.r**)

and place them in the appropriate directories where you have their associated programs.

- Run **Optimal Classification (PERFL.EXE)** in one dimension on **SEN102KH.ORD**. There were 550 roll calls in the 102^{nd} Senate. Note that the Optimal Classification Program Page has detailed instructions on how to set up **PERFSTRT.DAT**. Turn in a copy of **PERFSTRT.DAT** and **PERF21.DAT**.

- Run **keith2.r** to generate **sen102kh.rda** and then run **keithMCMC.r** to get the Senator medians and 2.5% and 97.5% quantiles. Combine the rank ordering from **Optimal Classification** with the **tab2.txt** output from **MCMCPack** and graph the rank ordering (horizontal axis) against the Bayesian MCMC medians (vertical axis) as you did for Question 2.c of Homework 5.

- Report the Pearson correlation between the **OC** rank ordering and the **MCMCPack** medians.

- Graph the rank ordering from **Optimal Classification** (horizontal axis) against the **IDEAL** medians (vertical axis).

- Report the correlation between the **OC** rank ordering and the **IDEAL** medians.

- Report the Pearson correlation between the lengths of the "confidence intervals" (2.5%, 97.5%) for the two Bayesian procedures and also report the Pearson correlation between the **IDEAL** medians and **MCMCPack** medians.
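The rank-ordering graphs and correlations requested in this problem follow the same pattern for both Bayesian procedures. The sketch below uses simulated values only: `oc_rank` and `ideal_median` are hypothetical stand-ins for the columns of the combined file, and the way the medians are generated (normal quantiles plus noise) is an assumption chosen just to give a monotone relationship.

```r
# Simulated stand-ins for the OC rank ordering and the Bayesian medians.
set.seed(7)
n <- 20
oc_rank <- 1:n
ideal_median <- qnorm((oc_rank - 0.5) / n) + rnorm(n, sd = 0.1)

# Rank ordering on the horizontal axis, medians on the vertical axis.
plot(oc_rank, ideal_median, pch = 16,
     xlab = "Optimal Classification Rank Ordering",
     ylab = "IDEAL Median")

# Label a couple of points; the labels here are placeholders.
text(oc_rank[c(1, n)], ideal_median[c(1, n)],
     labels = c("Legislator 1", "Legislator 20"), pos = c(4, 2), cex = 0.8)

# Pearson correlation, plus Spearman as an optional rank-based comparison.
r_pearson  <- cor(oc_rank, ideal_median)
r_spearman <- cor(oc_rank, ideal_median, method = "spearman")
print(round(c(pearson = r_pearson, spearman = r_spearman), 4))
```

Because ranks are evenly spaced while the medians bunch up or spread out along the scale, the relationship need not be linear. Comparing the Pearson value with a rank-based measure such as Spearman is one way to see how much of the gap is due to curvature rather than disagreement.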

- In this problem we are going to partially replicate Problem 5 of Homework 4 using the 104^{th} Senate -- **SEN104KH.ORD**. Place **HOUSYM3.EXE**, **SEN104KH.ORD**, and **SYMSTRT3.DAT** in the same directory. Use **Epsilon** to change **SYMSTRT3.DAT** so that **HOUSYM3.EXE** reads **SEN104KH.ORD** and writes **SEN104.DAT**. Note that there are 919 roll calls in **SEN104KH.ORD**.

- Turn in a copy of the **HOUSYM3.DAT** output file.

- Turn in a plot of the eigenvalues. Use **R** to do the plot.

- Use **Epsilon** to enter the following commands on top of your agreement score output file:

    TORSCA
    PRE-ITERATIONS=3
    DIMMAX=3,DIMMIN=1
    COORDINATES=ROTATE
    ITERATIONS=50
    REGRESSION=DESCENDING
    DATA,LOWERHALFMATRIX,DIAGONAL=PRESENT,CUTOFF=.01
    104TH U.S. SENATE AGREEMENT SCORES
    104 1 1
    (36X,104F4.0)
    *********agreement score file*********
    COMPUTE
    STOP

Be sure to put the **COMPUTE** and **STOP** lines on the bottom of the agreement score file.

Run **KYST** on this file and report the STRESS values for one, two, and three dimensions.

- Use **Epsilon** to combine the **one-dimensional** **KYST** coordinates with the **OC** rank orderings, the **MCMCPack** medians, and the **IDEAL** medians with the headers on the file. Turn in a *neatly formatted listing* of this file. Compute the 4 by 4 matrix of Pearson correlations and turn in a *neatly formatted and clearly labeled* table of these correlations.

- Any thoughts about what the above table of correlations tells us about this enterprise as a cumulative
science?
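The 4 by 4 table of Pearson correlations requested above can be produced with a single call to `cor()` once the four sets of estimates sit in one data frame. The sketch below uses simulated scores only: the column names and the way the values are generated from a common latent dimension are assumptions for illustration, not results from the 104^{th} Senate.

```r
# Simulated stand-ins for the four one-dimensional scalings.
set.seed(104)
n <- 104
latent <- sort(rnorm(n))   # hypothetical underlying dimension

scores <- data.frame(
  KYST     = latent + rnorm(n, sd = 0.1),
  OC       = rank(latent),                  # rank ordering, 1 to n
  MCMCPack = 2 * latent + rnorm(n, sd = 0.2),
  IDEAL    = latent + rnorm(n, sd = 0.1)
)

# 4 x 4 matrix of Pearson correlations, rounded for a readable table.
cors <- round(cor(scores), 3)
print(cors)
```

The row and column names of the result come from the data frame, which takes care of the labeling. With real output the sign of any correlation involving the **KYST** coordinates may be negative, since the orientation of a multidimensional scaling solution is arbitrary; the magnitude is what matters for the comparison.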
