XLConnect looks very nice: it is fully featured, and is implemented as objects. It uses a Java library to do the Excel imports, apparently. This would seem like a detail, but it ends up adding a very important wrinkle. According to the developer, Java's "ToString" function adds a ".0" to any values that look numeric. So, even if you tell XLConnect's readWorksheet function to read the column in as a character data type (using the colTypes argument), it will add a ".0" to every value that looks to Java like a number.
Here's an example. Let's make an excel document called XLConnect_test.xlsx:
Now, let's try to import it via XLConnect:
> require(XLConnect)Notice how XLConnect adds ".0" to each value, even though those columns are supposed to be character strings? That's not good. The nicely-communicative developer says that they're working on this for a future version. In the meantime, here's the hack I've written so the rest of us can move forward:
> xlc_df = readWorksheetFromFile(file="XLConnect_test.xlsx", sheet=1)
> xlc_df
Col1 Col2
1 ABC 1.0
2 123.0 2.0
3 123ABC ABC
> xlc_df = readWorksheetFromFile(file="XLConnect_test.xlsx", sheet=1, colTypes=c("character", "character"))
> xlc_df
Col1 Col2
1 ABC 1.0
2 123.0 2.0
3 123ABC ABC
xlc_dot_zero_hack <- function(df) {Just import your Excel data, and then run the function above on your imported dataframe. Don't forget to reassign the result to a dataframe. In our example, we would write:
# looks for all numerics that had a ".0" added to them by XLConnect, and removes the ".0"
return(
as.data.frame( lapply( df, FUN = function(x) sub("^(\\d*)\\.0$", "\\1", x, perl=TRUE) ) )
)
}
> xlc_df = xlc_dot_zero_hack(xlc_df)VoilĂ ! Note that if your data has any strings that *actually* have a ".0" at the end, this hack will have unexpected results. Specifically, it will remove *all* ".0"s that come after a string of digits.
> xlc_df
Col1 Col2
1 ABC 1
2 123 2
3 123ABC ABC
I hope this might help others out there with the same conundrum!
Wednesday, September 5, 2012
Cross-References don't show up in Word's dialog box?
I was having an issue with a manuscript I'm putting together recently. I was passing it back and forth with my advisor, and I was using Track Changes. As I added some figures with captions, I then wanted to refer to those figures in the text via a cross-reference.
When I clicked on Insert -> Cross-Reference in Word 2010, the recently-inserted captions weren't in the list!
Ugh, *another* Word screwup with complex documents.
Well, after a lost day of screwing around with the problem (and sleeping on it), I came across the solution.
The problem seems to be that Word doesn't like you to insert your figures and captions as a "tracked change". I went back and "accepted" all my insertions of figures and captions. And, voila! The figures now show up in the cross-reference dialog box. Now, whenever I insert a new figure, I'll either turn off track changes for the moment, or just go back and accept the insertion.
Hope this helped somebody else!
When I clicked on Insert -> Cross-Reference in Word 2010, the recently-inserted captions weren't in the list!
Ugh, *another* Word screwup with complex documents.
Well, after a lost day of screwing around with the problem (and sleeping on it), I came across the solution.
The problem seems to be that Word doesn't like you to insert your figures and captions as a "tracked change". I went back and "accepted" all my insertions of figures and captions. And, voila! The figures now show up in the cross-reference dialog box. Now, whenever I insert a new figure, I'll either turn off track changes for the moment, or just go back and accept the insertion.
Hope this helped somebody else!