Resolved: R scripts - Best practices

Question

Hello,

In the previous lecture, we saw two possible ways to add a new column to a data frame:

my.data$MarkScreenTime <- mark
(or)
my.data[["MarkScreenTime"]] <- mark

Then, in this lecture, I tried the other syntax when we set the missing (NA) values of the "species" column to "Unknown":

(from)    my.wars$species[is.na(my.wars$species)] <- "Unknown"
(to)      my.wars[["species"]][is.na(my.wars["species"])] <- "Unknown"

Surprisingly, R accepted it, and I got the same result.
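
The equivalence described above can be checked directly. The following is a minimal sketch, assuming a small stand-in data frame (the real my.wars data set is not shown in this thread, so the rows here are hypothetical):

```r
# Hypothetical stand-in for my.wars; the course data set is not shown here.
my.wars <- data.frame(
  name    = c("Luke", "R2-D2", "Rey"),
  species = c("Human", NA, NA),
  stringsAsFactors = FALSE
)

# Variant 1: $ on both sides.
a <- my.wars
a$species[is.na(a$species)] <- "Unknown"

# Variant 2: [[ ]] on both sides.
b <- my.wars
b[["species"]][is.na(b[["species"]])] <- "Unknown"

# Variant 3: the form from the question. is.na() on b2["species"] (a
# one-column data frame) gives a one-column logical matrix, which R
# accepts as a logical index into the column vector.
b2 <- my.wars
b2[["species"]][is.na(b2["species"])] <- "Unknown"

print(identical(a, b))
print(identical(a$species, b2$species))
```

Using is.na(my.wars[["species"]]) keeps both sides of the assignment working on the same plain vector, which may be easier to read than mixing [[ ]] and [ ].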

However, I remember that at the beginning of this course we were told that
vector . name is a good way, while
vector _ name is a bad way of naming vectors.

I'd like to ask the same question about "data.frame $ column.name" versus "data.frame [ "column_name" ]". (I added whitespace because the bold formatting got messed up when I posted the question the first time.)

Which of these two is the better way of referring to columns?

Kind regards,
Carl

Answer 1

Hi Carl,
thanks for reaching out! I'm not entirely sure which of the two variants is better; if R accepts both, you're free to choose. I mostly use $ because it's the syntax I'm used to in R. The [""] approach is similar to Python, and I suspect that's why it was incorporated into R's syntax (or it might be the other way around). I'm not aware of a definitive guideline on which to use. IMO both are acceptable ways to access a column in an R data frame, so feel free to use either version.
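
For readers weighing the two forms, here is a small sketch of two places where they behave differently on a base R data frame. The data frame and the variable col are hypothetical, chosen only for illustration; this is not a ruling on which style is better:

```r
# Hypothetical data frame for illustration only.
my.data <- data.frame(MarkScreenTime = c(10, 20, 30))

# 1. [[ ]] accepts a column name stored in a variable;
#    $ takes whatever follows it as the literal name.
col <- "MarkScreenTime"
from.variable <- my.data[[col]]   # the MarkScreenTime column
from.dollar   <- my.data$col      # NULL: there is no column named "col"

# 2. On a base data frame, $ partially matches column names,
#    while [[ ]] matches exactly by default.
partial <- my.data$Mark           # matches MarkScreenTime
exact   <- my.data[["Mark"]]      # NULL: no column is named exactly "Mark"

print(from.variable)
print(is.null(from.dollar))
print(partial)
print(is.null(exact))
```

So $ tends to be convenient for interactive work, while [[ ]] can be handy inside functions or loops where the column name is held in a variable.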
Hope this helps!

Best,
365 Eli

Answer 2

Yeah, that's what I thought too. But upon further reading, I found out that the $ syntax works on data frames (and lists) but not on matrices. I'm leaning toward $ though, as it's cleaner and helps me distinguish R from Python in Jupyter.
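
A quick sketch of the matrix point, using a hypothetical 2x2 matrix m and its data frame counterpart df (both names are made up for this example):

```r
# Hypothetical matrix with named columns, plus the same data as a data frame.
m  <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("a", "b")))
df <- as.data.frame(m)

# $ works on the data frame.
from.df <- df$a

# Matrices select a column with [ , "name"] instead.
from.matrix <- m[, "a"]

# $ on a matrix raises an error ("$ operator is invalid for atomic vectors"),
# so the call is wrapped in tryCatch() to keep the script running.
res <- tryCatch(m$a, error = function(e) "error")

print(identical(from.df, from.matrix))
print(res)
```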

Thank you.
