Vector-based cosine similarity for two matrices -- R in UNIX
Dear All,
I am facing a problem and would be thankful for your help.
I hope this is the right place to ask this question.
I have two files, each holding a matrix (10 rows, 3 columns), and I want the cosine similarity between corresponding rows (vectors) of the two files, so the result should be a (10, 1) matrix of cosine measures.
I am using the cosine function from the lsa package in R, called from UNIX, but I am having problems with it.
If each file had only one row, I could calculate the cosine similarity directly,
but I am having trouble reading the lines of the two files into vectors to do the same.
I have tried to write some code. It does not give any error, but it does not create anything, and I don't know what I am doing wrong (I am new to R).
Code:
con <- file('data01.txt', open="r")
con2 <- file('data02.txt', open="r")
a <- list();
b <- list();
test <- list();
current.line01 <- 1
current.line02 <- 1
while (length(data01 <- readLines(con, n = 10, warn = FALSE)) > 0) {
  while (length(data02 <- readLines(con2, n = 10, warn = FALSE)) > 0) {
    a[[current.line01]]<- c(data01)
    b[[current.line02]]<- c(data02)
    test <-cosine(a[[current.line01]], b[[current.line02]])
    write.table(test , "test.txt")
    current.line01 <- current.line + 1
    current.line02 <- current.line + 1
  }
}
close(con)
close(con2)
I have no knowledge of R, but I see you have two while loops nested.
Shouldn't it be just one while loop from file1,
and within that you read one record from file2?
Something like
Code:
while (length(data01 <- readLines(con, n = 10, warn = FALSE)) > 0) {
  data02 <- readLines(con2, n = 10, warn = FALSE)
  ...
}
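One possible way to fill in that single-loop pattern, as a sketch only: read one line from each file per iteration and convert the text to numbers before doing any arithmetic. The temporary files, their whitespace-separated values, and the `pairs` list are assumptions made for illustration, not the original data.

```r
# Hypothetical stand-ins for the two input files (whitespace-separated numbers).
f1 <- tempfile(fileext = ".txt")
f2 <- tempfile(fileext = ".txt")
writeLines(c("1 0 0", "1 2 3"), f1)
writeLines(c("0 1 0", "2 4 6"), f2)

con1 <- file(f1, open = "r")
con2 <- file(f2, open = "r")

pairs <- list()
# One loop only: each pass consumes line N of file 1 and line N of file 2.
while (length(line1 <- readLines(con1, n = 1, warn = FALSE)) > 0) {
  line2 <- readLines(con2, n = 1, warn = FALSE)
  if (length(line2) == 0) break  # second file ran out first

  # readLines() returns text; scan(text = ...) parses it into a numeric vector.
  x <- scan(text = line1, quiet = TRUE)
  y <- scan(text = line2, quiet = TRUE)

  pairs[[length(pairs) + 1]] <- list(x = x, y = y)
}

close(con1)
close(con2)
```

Each element of `pairs` then holds two numeric vectors of equal length, which is the form a cosine function expects.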
Yes, you forgot to rewind the tape on that inner file before reusing it, if you want an n-squared Cartesian product; but perhaps you want more of a paste: line N of both files only.
If you read a file to EOF with the inner while, then when the outer while loops, the inner file handle is still at EOF. Sequential disk files are like tape drives, and FILE* in C has a redundant command rewind(), which is an fseek to 0 absolute (see the man page for rewind, opensolaris section 3). Of course, R may rewind for you, but that seems a bit too magic.
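A small sketch of that EOF behaviour in R, using a made-up three-line temporary file: once a connection has been read to the end, a further readLines() call returns nothing, and one way to start over is to close the connection and open it again.

```r
# Hypothetical three-line file standing in for the inner input file.
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "7 8 9"), path)

con <- file(path, open = "r")
first_pass  <- readLines(con, n = 10, warn = FALSE)  # consumes all three lines
second_pass <- readLines(con, n = 10, warn = FALSE)  # connection is already at EOF
close(con)

# "Rewinding" by reopening: a fresh connection starts at the first line again.
con <- file(path, open = "r")
third_pass <- readLines(con, n = 10, warn = FALSE)
close(con)

print(c(length(first_pass), length(second_pass), length(third_pass)))
```

This is why a nested loop over two open connections runs its inner body only during the first outer iteration unless the inner file is reopened each time.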
I have changed a few things, including the inner loop, but now I get an error.
I personally can't see how it is going to read each of the second file's lines.
Code:
con <- file('data01.csv', open="r")
con2 <- file('data02.csv', open="r")
current.line<- 1
while (length(data01 <- readLines(con, n = 10, warn = FALSE)) > 0) {
  data02 <- readLines(con2, n = 10, warn = FALSE)
  a[[current.line]]<- as.vector(data01)
  b[[current.line]]<- as.vector(data02)
  test<- cosine (a, b)
  write.csv(test, file="test.txt", sep=",")
  current.line <- current.line+ 1
}
close(con)
close(con2)
The error I get:
Code:
Error in crossprod(x, y) :
requires numeric/complex matrix/vector arguments
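The message suggests that cosine() was handed non-numeric data: readLines() returns each line as a text string, and crossprod() needs numbers. A minimal sketch of one way around this, assuming both files are comma-separated, have no header, and have the same shape, is to read each file as a numeric matrix and compute the cosine of corresponding rows in base R. The temporary files and their values below are made-up stand-ins for data01.csv and data02.csv, and `row_cosine` is a name introduced here for illustration.

```r
# Hypothetical stand-ins for data01.csv and data02.csv (3 rows x 3 columns).
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
writeLines(c("1,0,0", "1,2,3", "1,1,0"), f1)
writeLines(c("0,1,0", "2,4,6", "1,0,0"), f2)

# read.csv() parses the fields as numbers; as.matrix() gives a numeric matrix.
A <- as.matrix(read.csv(f1, header = FALSE))
B <- as.matrix(read.csv(f2, header = FALSE))
stopifnot(identical(dim(A), dim(B)))

# Row-wise cosine: dot product of row i of A with row i of B,
# divided by the product of the two row norms.
row_cosine <- rowSums(A * B) / (sqrt(rowSums(A^2)) * sqrt(rowSums(B^2)))

# One cosine value per row pair, as an (n, 1) matrix.
result <- matrix(row_cosine, ncol = 1)

# Write the column of cosine values to a file (a temporary path in this sketch).
out <- tempfile(fileext = ".txt")
write.table(result, out, row.names = FALSE, col.names = FALSE)
print(result)
```

With two 10-row files this yields a (10, 1) result without any explicit loop. Calling the lsa package's cosine() on A[i, ] and B[i, ] for each row should give the same values, since both arguments are then numeric vectors.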