Anyone feel like talking "R"

Benp404
Posts: 28
Joined: Wed Oct 14, 2015 7:17 pm

Anyone feel like talking "R"

Post by Benp404 » Wed Jan 13, 2016 10:05 pm

Hi,

I am taking a data science class on Coursera. I am a little bit behind in the second class in a series of 10, so I may retake it. Does anyone have any insight into the software that would help, or feel like discussing what some of the code does?

JZinCO
Posts: 168
Joined: Fri Mar 20, 2015 6:32 pm

Re: Anyone feel like talking "R"

Post by JZinCO » Wed Jan 13, 2016 10:36 pm

I use it every day at work and have been using it for about 7 years now. I love it and try to use R in lieu of other platforms all the time; thankfully a lot of developers have worked on packages, so R is not a "stand-alone program". You can use R to interact with GIS (geographic information systems), databases, Python programs, Fortran, any API really. For example, when I started my job they wanted me to use command prompts to talk to our servers in SQL. I don't know SQL! So I use a library in R that acts as my interface: the library does the R->SQL translation for me.
I don't have any great unsolicited insight, except to say that package vignettes and Stack Exchange are your friends :)

David Jay
Posts: 5773
Joined: Mon Mar 30, 2015 5:54 am
Location: Michigan

Re: Anyone feel like talking "R"

Post by David Jay » Wed Jan 13, 2016 10:49 pm

Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
Prediction is very difficult, especially about the future - Niels Bohr | To get the "risk premium", you really do have to take the risk - nisiprius

Epsilon Delta
Posts: 7432
Joined: Thu Apr 28, 2011 7:00 pm

Re: Anyone feel like talking "R"

Post by Epsilon Delta » Wed Jan 13, 2016 10:52 pm

David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.

b23
Posts: 21
Joined: Wed Dec 30, 2015 12:48 pm

Re: Anyone feel like talking "R"

Post by b23 » Wed Jan 13, 2016 11:40 pm

I started with C, then C++, FoxPro, Visual Basic, PL/SQL, JavaScript, VBScript, ASP, Unix shell script, PHP, Java, JSP, VB.net, C#, ASP.net and other related technologies.... :twisted:

stgrimes
Posts: 9
Joined: Sat Aug 10, 2013 10:01 am
Location: Philadelphia, PA

Re: Anyone feel like talking "R"

Post by stgrimes » Wed Jan 13, 2016 11:57 pm

I'm starting to look at the PerformanceAnalytics package in R. I've been enjoying this series from Capital Spectator:

http://www.capitalspectator.com/portfol ... -analysis/

I could be convinced to work through that class with you!

David Jay
Posts: 5773
Joined: Mon Mar 30, 2015 5:54 am
Location: Michigan

Re: Anyone feel like talking "R"

Post by David Jay » Thu Jan 14, 2016 1:25 am

Epsilon Delta wrote:
David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
Hey, Fortran IV was my first programming class in 1976. MA306 (i.e. math department, there was no such thing as CS).

[edit] sorry, way off topic...

quantAndHold
Posts: 2031
Joined: Thu Sep 17, 2015 10:39 pm

Re: Anyone feel like talking "R"

Post by quantAndHold » Thu Jan 14, 2016 10:12 am

Not R, but I just finished the Stanford machine learning class on Coursera. It gave me a new appreciation for the joys of MATLAB.

Starper
Posts: 105
Joined: Sun Oct 26, 2014 10:02 pm

Re: Anyone feel like talking "R"

Post by Starper » Thu Jan 14, 2016 10:21 am

I am interested in learning R. What are the best sources, and how long will it take for somebody with an analytical background (I already know SQL and other tools)?

ixohoxi
Posts: 117
Joined: Fri Nov 22, 2013 2:58 pm

Re: Anyone feel like talking "R"

Post by ixohoxi » Thu Jan 14, 2016 10:23 am

I'm signed up for the Exploratory Data Analysis course on Coursera. I couldn't submit the swirl() assignments, got an SSL error. The course content is pretty easy, though.
Henceforth, content shall be my aim, and anticipation my joy. -Alfred Billings Street

jridger2011
Posts: 458
Joined: Sun Feb 06, 2011 4:17 pm

Re: Anyone feel like talking "R"

Post by jridger2011 » Thu Jan 14, 2016 10:39 am

I feel like talking R. Let me know what topics would be fun to discuss.

There are some paths that are fun to explore:

1. Data Analysis => turn your old SAS code into R. If stats is not something you do day to day, move to option 2.

2. Data Aggregation => turn SQL queries within tables you have into very short scripts using R packages such as tidyr, dplyr, etc.

3. Data Cleaning => missing values? inconsistency? R has packages that assist in this. It's tedious but necessary to know how to do.

The power in R is the extra packages written by people who are passionate about the ecosystem of options.
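As a rough illustration of paths 2 and 3 above, here is a minimal sketch in base R only (the post recommends dplyr and tidyr; base R is swapped in here so the example runs without installing anything). The data frame `sales` and its columns are made up for the example.

```r
# Hypothetical sales data with a missing value and inconsistent labels.
sales <- data.frame(
  region = c("east", "East", "west", "west", "east"),
  amount = c(100, 150, NA, 200, 50),
  stringsAsFactors = FALSE
)

# Cleaning: normalise the labels and drop rows with a missing amount.
sales$region <- tolower(sales$region)
clean <- sales[!is.na(sales$amount), ]

# Aggregation: roughly SELECT region, SUM(amount) ... GROUP BY region.
totals <- aggregate(amount ~ region, data = clean, FUN = sum)
print(totals)
```

With dplyr the aggregation step would usually be written as a `group_by()` followed by `summarise()`, which tends to read closer to the SQL it replaces.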

Benp404
Posts: 28
Joined: Wed Oct 14, 2015 7:17 pm

Re: Anyone feel like talking "R"

Post by Benp404 » Thu Jan 14, 2016 11:01 am

Data analysis is the big topic... I have some code to draw in 332 .csv files and would like to understand the list.files; list. I was trying to compare the list.files function to data.frame or data.matrix file. What does the ______._________ mean? I am kinda basic right now. :beer

jridger2011
Posts: 458
Joined: Sun Feb 06, 2011 4:17 pm

Re: Anyone feel like talking "R"

Post by jridger2011 » Thu Jan 14, 2016 11:20 am

Benp404 wrote:Data analysis is the big topic... I have some code to draw in 332 .csv files and would like to understand the list.files; list. I was trying to compare the list.files function to data.frame or data.matrix file. What does the ______._________ mean? I am kinda basic right now. :beer
What type of files are these? 332 separate files of the same type with same headers? Are you trying to merge all 332 into a giant data frame?

JZinCO
Posts: 168
Joined: Fri Mar 20, 2015 6:32 pm

Re: Anyone feel like talking "R"

Post by JZinCO » Thu Jan 14, 2016 11:48 am

jridger2011 wrote:I feel like talking R. Let me know what topics would be fun to discuss.

There are some paths that are fun to explore:

1. Data Analysis => turn your old SAS code into R. If stats is not something you do day to day, move to option 2.
ughhhh that is something I am doing by hand. Over the past few weeks a colleague has been sending me blocks of SAS code. I am simultaneously learning the methodology behind what the code blocks do while writing up a script so that we can implement the methodology for an annual report.

JZinCO
Posts: 168
Joined: Fri Mar 20, 2015 6:32 pm

Re: Anyone feel like talking "R"

Post by JZinCO » Thu Jan 14, 2016 12:02 pm

jridger2011 wrote:
Benp404 wrote:Data analysis is the big topic... I have some code to draw in 332 .csv files and would like to understand the list.files; list. I was trying to compare the list.files function to data.frame or data.matrix file. What does the ______._________ mean? I am kinda basic right now. :beer
What type of files are these? 332 separate files of the same type with same headers? Are you trying to merge all 332 into a giant data frame?
list.files is like typing dir in a windows cmd. It just says what is in the directory.
If you want to import all of the csvs into a df then you could type

allcsvfilesconcat <- data.frame()
for (i in list.files()) {  # assumes you have already set your working dir
  csv.temp <- read.table(i, sep = ',', header = T)  # read in each csv one by one
  allcsvfilesconcat <- rbind(allcsvfilesconcat, csv.temp)  # Add contents of csv files to each prior csv
}
This works only if you have formatted your columns similarly across csv files.
Probably a better way to do this (an apply function?). I am lazy and use inefficient loops :)
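The "apply function" alternative hinted at above could look something like the sketch below. It is only an illustration: the directory `csv_demo` and the files in it are created on the fly as stand-ins for the real data, and all names are hypothetical.

```r
# Build a throwaway directory with three small CSV files that share
# the same columns (stand-ins for the real data).
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:3) {
  write.csv(data.frame(id = k, value = c(1.5, 2.5) * k),
            file.path(demo_dir, sprintf("file%d.csv", k)),
            row.names = FALSE)
}

# Only pick up .csv files, with full paths so no setwd() is needed.
csv_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list, then stack the pieces in one rbind call.
pieces <- lapply(csv_paths, read.csv)
combined <- do.call(rbind, pieces)
print(dim(combined))
```

Calling rbind once at the end avoids re-copying the growing data frame on every pass of the loop, which is where the loop version tends to slow down with hundreds of files.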

parsi1
Posts: 281
Joined: Tue May 29, 2012 8:03 am

Re: Anyone feel like talking "R"

Post by parsi1 » Thu Jan 14, 2016 12:44 pm

Sorry, I am an old guy. I like Fortran and still sometimes write code in Fortran for data processing.

kazper
Posts: 623
Joined: Fri Aug 01, 2014 7:45 pm

Re: Anyone feel like talking "R"

Post by kazper » Thu Jan 14, 2016 7:01 pm

SAS user here, although I have wanted to learn R for a while. What are the advantages?

jridger2011
Posts: 458
Joined: Sun Feb 06, 2011 4:17 pm

Re: Anyone feel like talking "R"

Post by jridger2011 » Thu Jan 14, 2016 7:12 pm

kazper wrote:SAS user here, although I have wanted to learn R for a while. What are the advantages?
I think the main advantage is being free to use. I still have a SAS book from the official course I took years back. I haven't opened it in years because I don't have a copy at home to use. Some workplaces restrict SAS to only critical daily users because of the cost.

Also, with R there are several ways through Packages, to solve a problem. Some Packages are designed a certain way to do some things extremely well, so writing a lot of code isn't necessary. Try it out and see.

Youtube: Installing RStudio

JZinCO
Posts: 168
Joined: Fri Mar 20, 2015 6:32 pm

Re: Anyone feel like talking "R"

Post by JZinCO » Thu Jan 14, 2016 7:17 pm

kazper wrote:SAS user here, although I have wanted to learn R for a while. What are the advantages?
Open source, free, and more flexibility (e.g. I think the main things you can do with SAS are statistics, business intelligence, and some DB management). SAS is pretty close to SPSS, JMP and such in that regard.
Don't think of R as a statistical package like SAS. Think of R as a programming language written by statisticians. R has a steeper learning curve, and unlike SAS it is much more explicit in terms of inputs and outputs. SAS is nice because its outputs dump everything you think you need, the other stuff you truly need, and then some. I'm not sure about SAS's limitations, but R can be limiting, stemming from its roots and how R was coded. It can't parallel process well, can't do calculus well, and it seems to get bogged down when you have data frames/matrices with more than 1 million rows or columns. If someone is always running up against said limitations, I say they should move on to MATLAB.
I also use R in my consulting work just to make really pretty figures that Excel can't match. About the only better plotting program is SigmaPlot, which is horribly expensive.

Benp404
Posts: 28
Joined: Wed Oct 14, 2015 7:17 pm

Re: Anyone feel like talking "R"

Post by Benp404 » Fri Jan 15, 2016 9:21 am

JZinCO wrote:
jridger2011 wrote:
Benp404 wrote:Data analysis is the big topic... I have some code to draw in 332 .csv files and would like to understand the list.files; list. I was trying to compare the list.files function to data.frame or data.matrix file. What does the ______._________ mean? I am kinda basic right now. :beer
What type of files are these? 332 separate files of the same type with same headers? Are you trying to merge all 332 into a giant data frame?
list.files is like typing dir in a windows cmd. It just says what is in the directory.
If you want to import all of the csvs into a df then you could type

allcsvfilesconcat <- data.frame ()
for ( i in list.files()) { #assumes you have already set your working dir
csv.temp <- read.table (i, sep=',',header=T) #read in each csv one by one
allcsvfilesconcat <- rbind(allcsvfilesconcat,csv.temp) #Add contents of csv files to each prior csv
}
This works only if you have formatted your columns similarly across csv files.
Probably a better way to do this (an apply function?). I am lazy and use inefficient loops :)
The 332 are all separate files with the same column headings. The files are all organized the same.
I can set my working directory by
Path <- "C:/" ## I would use my actual working directory. Then use the list.files pattern

listFiles <- list.files(Path, pattern="\\.csv$", full.names=T) # I am working to understand this part of the code.
## https://gist.github.com/kfeoktistoff/9f ... 5d8496b209 - this is interesting (this is an answer to the assignment i am working on. The assignment is write three functions. The github link shows the last questions first. So the question I am working on is at the bottom of the page. )

Is there any way to use a for loop and define my working directory? The idea is to write a function in the end that lets me select a subset of the csv files (for example files 10:30) and calculate the mean of sulfate (column [2]) or nitrate (column [3]).
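One possible shape for such a function is sketched below. It is a guess at the structure, not the course's reference answer: the function name `column_mean`, the zero-padded file names, and the demo data are all hypothetical, and it assumes every file has columns literally named "sulfate" and "nitrate". Passing the directory as an argument means the working directory never has to change.

```r
# Hypothetical helper: mean of one column across a chosen subset of files.
# Assumes each file has columns named "sulfate" and "nitrate".
column_mean <- function(directory, column, ids) {
  paths <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  paths <- paths[ids]                 # e.g. ids = 10:30 keeps files 10 to 30
  values <- unlist(lapply(paths, function(p) read.csv(p)[[column]]))
  mean(values, na.rm = TRUE)          # ignore missing readings
}

# Made-up demo files (001.csv to 004.csv) so the sketch can be run as is.
demo_dir <- file.path(tempdir(), "monitor_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:4) {
  write.csv(data.frame(sulfate = c(k, NA), nitrate = c(2 * k, 2 * k)),
            file.path(demo_dir, sprintf("%03d.csv", k)),
            row.names = FALSE)
}

sulfate_mean <- column_mean(demo_dir, "sulfate", 2:3)
nitrate_mean <- column_mean(demo_dir, "nitrate", 1:4)
print(c(sulfate_mean, nitrate_mean))
```

Selecting files by position only works if list.files() returns them in the intended order; it sorts alphabetically, so zero-padded names sort correctly while names like 2.csv and 10.csv would not.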

Benp404
Posts: 28
Joined: Wed Oct 14, 2015 7:17 pm

Re: Anyone feel like talking "R"

Post by Benp404 » Fri Jan 15, 2016 2:21 pm

stgrimes wrote:I'm starting to look at the PerformanceAnalytics package in R. I've been enjoying this series from Capital Spectator:

http://www.capitalspectator.com/portfol ... -analysis/

I could be convinced to work through that class with you!

Hi,
This is a great offer for me!!! When can we start?

I just figured out lapply() is a function in R that takes a vector and runs a function over each element of that vector. Here is how I use the function to read the files in my working directory:

path = ("C:/User/Gateway/Benp404/specdata")
listFiles <- list.files(path=path, pattern="\\.csv$", full.names=T)
allFiles <- lapply(listFiles, read.csv, header=T)
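For anyone following along, a tiny stand-alone sketch of what lapply() does may help. The numbers are arbitrary; only base R functions are used.

```r
# lapply() calls the function once per element and always returns a list.
squares <- lapply(1:3, function(x) x^2)
str(squares)

# Extra arguments after the function are passed along on every call,
# which is how header = T reaches read.csv in the snippet above.
rounded <- lapply(c(1.234, 5.678), round, digits = 1)

# sapply() does the same work but simplifies the result to a vector.
squares_vec <- sapply(1:3, function(x) x^2)
print(squares_vec)
```

In the snippet above, allFiles is therefore a list with one data frame per csv file, and allFiles[[1]] is the first of them.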

As far as writing my own function to do what is asked of me in class I am a little lost.

I hope to hear from you!!!

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Tue Jun 27, 2017 1:56 pm

Does anyone have an opinion on the advantages/disadvantages of R versus EXCEL? The most compelling thing I have read in blogs is that R source code is reproducible meaning (I think) that the same code can be run on a variety of different data sets. And R shows the data and analysis separately which may make it easier to follow the logic of what is happening in an analysis from start to finish.

R has a huge variety of packages available to any of a number of data analysis tasks. But how do I know that the packages are any good i.e. bug free and properly tested? Excel functions are provided by Microsoft and I would have more faith in their accuracy given what I know at this point in time.

In research papers that employ data analysis, my understanding is that SAS is the gold standard. Reviewers want to see that SAS was used to do the analysis. Is opensource software like R suitable for this purpose?

I have consistently read that R has excellent graphics. They are preferable to Excel, but this consideration in itself would not be enough to sway me.
A Boglehead can stay the course longer than the market can stay irrational.

Sourc3
Posts: 74
Joined: Wed Aug 05, 2015 4:45 pm

Re: Anyone feel like talking "R"

Post by Sourc3 » Tue Jun 27, 2017 3:24 pm

I also took a series of Coursera R classes but ended up not using it for work after I took them so definitely very rusty. I would be interested in creating an open-source set of tools for financial analysis for Bogleheads to use.

Since R is open source and cross-platform, I think it could be a useful tool for everyone here on the board regardless of whether or not they can afford to pay for Microsoft Excel. If anyone is interested or has already started something similar, I would be happy to contribute.

alex_686
Posts: 4031
Joined: Mon Feb 09, 2015 2:39 pm

Re: Anyone feel like talking "R"

Post by alex_686 » Tue Jun 27, 2017 3:58 pm

DartThrower wrote:Does anyone have an opinion on the advantages/disadvantages of R versus EXCEL?
It is like asking about hammers versus screwdrivers. They are different tools.

I have deep experience with Excel but I am working on learning R. For myself, Excel is easy to learn and flexible. R is for data analysis of big sets.

I will say this: if given a choice between Excel and R, I would choose R. R is a statistical package that is meant to interface with data sets. This means there is a robustness to the output. You can get that in Excel but not necessarily. I have seen Excel do some incredible things - like acting like a relational database. However, I have always found that when used as such it tends to have a rickety quality to it. Some user will come along and add or modify something and the whole thing falls apart.

Abe
Posts: 1807
Joined: Fri Sep 18, 2009 5:24 pm
Location: Earth in the Milky Way Galaxy

Re: Anyone feel like talking "R"

Post by Abe » Tue Jun 27, 2017 4:15 pm

I was going too fast around the curve and my car went R-borne.
Slow and steady wins the race.

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Tue Jun 27, 2017 8:20 pm

alex_686 wrote:
I will say this: if given a choice between Excel and R, I would choose R. R is a statistical package that is meant to interface with data sets. This means there is a robustness to the output. You can get that in Excel but not necessarily. I have seen Excel do some incredible things - like acting like a relational database. However, I have always found that when used as such it tends to have a rickety quality to it. Some user will come along and add or modify something and the whole thing falls apart.
Thanks Alex. I have seen some rickety worksheets at my workplace too. One of Excel's strengths is also one of its weaknesses. It's easy to slap together a worksheet, but it is also easy to put together something that lends itself to becoming corrupt with bad data or errant mouse clicks. I have to confess that I don't know enough about R to know its weaknesses.

JupiterJones
Posts: 2702
Joined: Tue Aug 24, 2010 3:25 pm
Location: Nashville, TN

Re: Anyone feel like talking "R"

Post by JupiterJones » Wed Jun 28, 2017 1:09 pm

DartThrower wrote:Does anyone have an opinion on the advantages/disadvantages of R versus EXCEL? The most compelling thing I have read in blogs is that R source code is reproducible meaning (I think) that the same code can be run on a variety of different data sets. And R shows the data and analysis separately which may make it easier to follow the logic of what is happening in an analysis from start to finish.

R has a huge variety of packages available to any of a number of data analysis tasks. But how do I know that the packages are any good i.e. bug free and properly tested? Excel functions are provided by Microsoft and I would have more faith in their accuracy given what I know at this point in time.

In research papers that employ data analysis, my understanding is that SAS is the gold standard. Reviewers want to see that SAS was used to do the analysis. Is opensource software like R suitable for this purpose?

I have consistently read that R has excellent graphics. They are preferable to Excel, but this consideration in itself would not be enough to sway me.
As a frequent user of both, here are my thoughts:

You can do a lot in Excel, and fairly quickly. It's my go-to for slapping together a quick chart or table and emailing it to someone, etc. While Excel's charting is pretty lousy if you just go with the defaults, it can be made to look very good with some tweaking. Plus Excel fits in well with the whole Microsoft ecosystem. If your project involves other Office apps, Excel may be the sensible tool for it.

But Excel is a general-purpose spreadsheet application and not, at its core, a statistics program. Yes, you can use Excel to do means, medians, correlations, and even a simple univariate linear regression. But statistics is R's raison d’être, and it simply punches at a much higher weight than Excel in that arena. If you're doing more complicated regressions, machine learning, time series analysis, etc., R is the way to go. (Yes, there are Excel add-ons that give you some impressive functionality, but I figure that by the time you learn how to wangle them, you might as well have just learned R!)

Not being a researcher, I can't speak to what reviewers of papers want to see. But I'm skeptical that there's any requirement or even preference for SAS these days. R is very popular in academic settings as far as I can tell. According to at least one report SAS usage seems to be on the decline in general, with R (and Python--which is another option to consider) gaining ground every year.

Another R advantage, as you mentioned, is the packages. So, so many of them! Are all of them bug free and tested? No, but most of them are. Remember, the really popular packages are used by tons of people all over the world. If there are bugs, they tend to get found pretty quickly! And, since the packages are open-source, the bugs tend to get fixed quickly. Besides, Microsoft isn't always perfect either. ;-)

But what makes R the big winner for me is the scripting. In Excel, you typically perform a process to create the end result. With R, you typically write the process to create the end result. Which means you--or someone else who wants to verify your results--can re-run it at any time. And, as you mentioned, you can re-run it with a new set of data, letting you easily update a sales forecast, score a new batch of customers, etc. (You can achieve some level of reproducibility and automation in Excel via macros/VBA, but it would be far more painful than R.)
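To make the "write the process once, re-run it on new data" point concrete, here is a minimal sketch. The function name `forecast_next`, the column names, and the sales figures are all invented for illustration; the model is just a straight-line trend from base R's lm().

```r
# The whole analysis lives in one function, so it can be re-run on any
# data frame with the same columns (hypothetical names: month, sales).
forecast_next <- function(df) {
  fit <- lm(sales ~ month, data = df)        # simple linear trend
  next_month <- max(df$month) + 1
  predict(fit, newdata = data.frame(month = next_month))
}

# Made-up numbers, purely for illustration.
last_year <- data.frame(month = 1:6, sales = c(10, 12, 14, 16, 18, 20))
this_year <- data.frame(month = 1:6, sales = c(20, 25, 30, 35, 40, 45))

f_last <- forecast_next(last_year)
f_this <- forecast_next(this_year)
print(c(f_last, f_this))
```

Anyone who has the script and the data can call forecast_next() again and should get the same numbers, which is the reproducibility being described.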

And with RMarkdown, you can actually write your paper (or slidedeck) in R, with the underlying code that creates the results actually embedded in the document itself! Rather than having R scripts that create charts and provide results, which you then collect and manually put into a separately-created document, with RMarkdown it's all one single thing, which is enormously cool.
Stay on target...

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Mon Jul 17, 2017 3:19 pm

JupiterJones wrote:And with RMarkdown, you can actually write your paper (or slidedeck) in R, with the underlying code that creates the results actually embedded in the document itself! Rather than having R scripts that create charts and provide results, which you then collect and manually put into a separately-created document, with RMarkdown it's all one single thing, which is enormously cool.
Jupiter,
Thank you for that enormously helpful response! I especially liked the part about R Markdown. I am now taking a coursera course in R Programming and plan to make a real commitment to the language. My employer will likely not renew SAS when the license is up for renewal in a few months, and I just can't imagine doing everything in EXCEL.

In the 80s I used BASIC, APL and some Fortran and SAS. By the end of the 90s I was using all SAS. Now after all this time it looks like I will use yet another new language before my career is over. It's exciting to learn new things, but I feel like I need to take several steps backwards to do it. :?

One big advantage about this new era is that there is a huge amount of help available online for anyone who has questions on R. I work in an office where nobody knows what R is let alone anything about it, so I will need to use online resources extensively.

Limoncello402
Posts: 108
Joined: Wed May 03, 2017 3:58 pm

Re: Anyone feel like talking "R"

Post by Limoncello402 » Mon Jul 17, 2017 4:08 pm

I clicked on this thinking the big "R" was for Retirement. Shows you what I know :happy

BolderBoy
Posts: 4173
Joined: Wed Apr 07, 2010 12:16 pm
Location: Colorado

Re: Anyone feel like talking "R"

Post by BolderBoy » Mon Jul 17, 2017 4:11 pm

Epsilon Delta wrote:Or worse, Visual Basic in Visual Basic.
Hey! I wrote a medical billing suite in VB. And a local hospital is still using my VB-coded patient tracking software because it beats their built-in alternative.

But I know nothing about R.
"Never underestimate one's capacity to overestimate one's abilities" - The Dunning-Kruger Effect

Dottie57
Posts: 4787
Joined: Thu May 19, 2016 5:43 pm

Re: Anyone feel like talking "R"

Post by Dottie57 » Mon Jul 17, 2017 4:18 pm

David Jay wrote:
Epsilon Delta wrote:
David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
Hey, Fortran IV was my first programming class in 1976. MA306 (i.e. math department, there was no such thing as CS).

[edit] sorry, way off topic...
My first work language was Fortran 77

David Jay
Posts: 5773
Joined: Mon Mar 30, 2015 5:54 am
Location: Michigan

Re: Anyone feel like talking "R"

Post by David Jay » Mon Jul 17, 2017 8:39 pm

Dottie57 wrote:
David Jay wrote:
Epsilon Delta wrote:
David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
Hey, Fortran IV was my first programming class in 1976. MA306 (i.e. math department, there was no such thing as CS).

[edit] sorry, way off topic...
My first work language was Fortran 77
...just a kid. :happy

daveydoo
Posts: 1564
Joined: Sun May 15, 2016 1:53 am

Re: Anyone feel like talking "R"

Post by daveydoo » Mon Jul 17, 2017 8:45 pm

Benp404 wrote:Anyone have any insight into the software that would help or feel like discussing what some of the code does?
Totally jealous -- trying to get my kids to learn R (on their own) since it's been invaluable for the sharp analysts I work with. But it's too hard for me to just learn it in a vacuum.
"I mean, it's one banana, Michael...what could it cost? Ten dollars?"

trojans10
Posts: 11
Joined: Fri Sep 16, 2016 5:13 am

Re: Anyone feel like talking "R"

Post by trojans10 » Tue Jul 18, 2017 2:09 am

i just learned a bit of R, i like the way they have packages for literally everything. im now diving into Python, and not enjoying it as much. i don't come from a programming background, so R kind of was a good fit. shiny app is pretty cool to mess with. visualizations are great as well.

what are some job titles you guys have working with data? im a marketing analyst, a bit stuck in my career for the most part. i like analytics, visualizations, etc. but don't necessarily want to be a programmer.

triceratop
Moderator
Posts: 5753
Joined: Tue Aug 04, 2015 8:20 pm
Location: la la land

Re: Anyone feel like talking "R"

Post by triceratop » Tue Jul 18, 2017 2:54 am

David Jay wrote:
Dottie57 wrote:
David Jay wrote:
Epsilon Delta wrote:
David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
Hey, Fortran IV was my first programming class in 1976. MA306 (i.e. math department, there was no such thing as CS).

[edit] sorry, way off topic...
My first work language was Fortran 77
...just a kid. :happy
My last published journal paper was based entirely off a Fortran95 code I wrote, as will my next one. My current research group uses Fortran heavily. It is a good language for some uses. It's funny because my dad used Fortran '66 back in college in the early 70s and I use it in grad school even today. Of course in many ways they are entirely different languages (who could have imagined object oriented Fortran in the 60s!).

It's always fun when the tech whizkids bash C and Fortran these days -- they just don't get how performant these languages can be.

Oh, I used R once. It was nice for manipulating data. If I did that more I would use it. It has my recommendation based on my admittedly limited knowledge.
"To play the stock market is to play musical chairs under the chord progression of a bid-ask spread."

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Tue Jul 18, 2017 4:46 am

trojans10 wrote:i just learned a bit of R, i like the way they have packages for literally everything. im now diving into Python, and not enjoying it as much. i don't come from a programming background, so R kind of was a good fit. shiny app is pretty cool to mess with. visualizations are great as well.

what are some job titles you guys have working with data? im a marketing analyst, a bit stuck in my career for the most part. i like analytics, visualizations, etc. but don't necessarily want to be a programmer.
My title is Statistical Programmer. The field of data management and analysis seems to be broken down into 1) Data Science and 2) Predictive Analytics these days, where Data Scientists have a stronger coding background and often deal with less structured data, whereas Predictive Analysts do primarily statistical modeling on more structured data. According to this video from Burtch Works, data scientists tend to prefer Python whereas predictive analysts often prefer either SAS or R:
https://www.youtube.com/watch?v=xH01u-0drbc
At 5:45 they start discussing data science vs predictive analytics.

Seasonal
Posts: 149
Joined: Sun May 21, 2017 1:49 pm

Re: Anyone feel like talking "R"

Post by Seasonal » Tue Jul 18, 2017 6:25 am

DartThrower wrote:My title is Statistical Programmer. The field of data management and analysis seems to be broken down into 1) Data Science and 2) Predictive Analytics these days, where Data Scientists have a stronger coding background and often deal with less structured data, whereas Predictive Analysts do primarily statistical modeling on more structured data. According to this video from Burtch Works, data scientists tend to prefer Python whereas predictive analysts often prefer either SAS or R:
https://www.youtube.com/watch?v=xH01u-0drbc
At 5:45 they start discussing data science vs predictive analytics.
I was recently talking with someone doing research that combined neuroscience and physics. He was using Python and MATLAB and was fond of Anaconda for Python.

GLState
Posts: 141
Joined: Wed Feb 15, 2017 10:38 am

Re: Anyone feel like talking "R"

Post by GLState » Tue Jul 18, 2017 8:36 am

I use R to test the ideas, concepts, and portfolios found on the Bogleheads forum and elsewhere. In R, it is easy to compare the past risks and returns of the 3 fund, lazy portfolios, Larry portfolio, etc. I can see the effects of adding gold, REITs, or small value to a portfolio. I can test if rebalancing daily, quarterly, or yearly makes a difference. Instead of using Portfolio Visualizer, I can do the same thing in R. I can create Morningstar-type "Growth of $10,000" charts for funds or portfolios. In short, I can test what others claim to be true.
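A bare-bones sketch of the "Growth of $10,000" and rebalancing ideas might look like this. The returns are invented numbers for two hypothetical funds, not real market data, and the 60/40 split is only an example; a real test would load historical returns instead.

```r
# Hypothetical yearly returns for two made-up funds (not real data).
stock_ret <- c(0.10, -0.05, 0.20)
bond_ret  <- c(0.03,  0.04, 0.01)

# Growth of $10,000 in a single fund: compound the returns.
growth_stock <- 10000 * cumprod(1 + stock_ret)

# 60/40 portfolio rebalanced every year: the portfolio return each
# year is the weighted average of the two fund returns.
weights <- c(0.6, 0.4)
port_ret <- weights[1] * stock_ret + weights[2] * bond_ret
growth_rebal <- 10000 * cumprod(1 + port_ret)

# Same 60/40 start, never rebalanced: each sleeve compounds on its own.
growth_hold <- 10000 * (weights[1] * cumprod(1 + stock_ret) +
                        weights[2] * cumprod(1 + bond_ret))

print(round(cbind(growth_stock, growth_rebal, growth_hold)))
```

Passing any of these vectors to plot() with type = "l" gives the familiar growth chart, and comparing growth_rebal with growth_hold is the simplest version of the rebalancing test.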

JD2775
Posts: 242
Joined: Thu Jul 09, 2015 10:47 pm

Re: Anyone feel like talking "R"

Post by JD2775 » Tue Jul 18, 2017 10:12 am

Seasonal wrote:
DartThrower wrote:My title is Statistical Programmer. The field of data management and analysis seems to be broken down into 1) Data Science and 2) Predictive Analytics these days, where Data Scientists have a stronger coding background and often deal with less structured data, whereas Predictive Analysts do primarily statistical modeling on more structured data. According to this video from Burtch Works, data scientists tend to prefer Python whereas predictive analysts often prefer either SAS or R:
https://www.youtube.com/watch?v=xH01u-0drbc
At 5:45 they start discussing data science vs predictive analytics.
I was recently talking with someone doing research that combined neuroscience and physics. He was using Python and MATLAB and was fond of Anaconda for Python.
I recently took a Python for Data Science class where we downloaded Anaconda and used the Jupyter Notebook for the class. The Jupyter Notebook is great for doing step by step analysis and seeing results.

oneleaf
Posts: 2350
Joined: Mon Feb 19, 2007 5:48 pm

Re: Anyone feel like talking "R"

Post by oneleaf » Thu Jul 20, 2017 12:01 am

JD2775 wrote:
I recently took a Python for Data Science class where we downloaded Anaconda and used the Jupyter Notebook for the class. The Jupyter Notebook is great for doing step by step analysis and seeing results.
Jupyter is great. I do a lot of my reports and exploratory analysis in Python using Jupyter, but I also like R Markdown for a similar experience using R, mainly because the generated reports often look much better in R without much tweaking.

In general, for those getting into data analysis, there are two main programming platforms, each with its associated packages, to learn:
- Python, with Pandas for data wrangling and Matplotlib (and Seaborn) for plotting, using Jupyter for a nice workflow.
- R, using dplyr for wrangling and ggplot2 for plotting, and using RStudio and particularly R Markdown.

I love Pandas in Python and find it a joy to write, but I find the Jupyter output and matplotlib plots to look pretty ugly. R is often more tedious to write, but R Markdown and ggplot2 can generate some phenomenal reports and plots.

I use both daily but am leaning slightly towards R these days.

As for Excel, I think PowerPivot is an incredible add-in for data analysis. DAX has some powerful features. That said, I find Excel to be very unreliable for large datasets. You never know when it is going to choke and become painfully slow. You also never know when your file will get corrupted. With R or Python, I can load much bigger datasets without issues. My Excel data models with fewer than a million rows of data often choke, while R or Python can handle well over 10 million rows of the same type of data with ease on the same PC. It is a world of difference, and it is why I rely on R and Python for most of my work.

Epsilon Delta
Posts: 7432
Joined: Thu Apr 28, 2011 7:00 pm

Re: Anyone feel like talking "R"

Post by Epsilon Delta » Sun Jul 23, 2017 9:44 pm

triceratop wrote:several layers of quotes snipped
Epsilon Delta wrote:
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
It's always fun when the tech whizkids bash C and Fortran these days -- they just don't get how performing these languages can be.
I won't bash Fortran in Fortran; even idiomatic "Fortran" in C++ wouldn't be too bad. But what I see is a mishmash of Fortran and C++ idioms. Things that should be in constructors aren't, and things that shouldn't be in constructors are. Constructors are called for unrelated side effects. And then the erstwhile Fortran programmer discovers throw.

Starper
Posts: 105
Joined: Sun Oct 26, 2014 10:02 pm

Re: Anyone feel like talking "R"

Post by Starper » Wed Jan 24, 2018 1:32 pm

For a SAS user, does it make more sense to start learning Python or R?

JBTX
Posts: 4243
Joined: Wed Jul 26, 2017 12:46 pm

Re: Anyone feel like talking "R"

Post by JBTX » Wed Jan 24, 2018 1:48 pm

David Jay wrote:
Thu Jan 14, 2016 1:25 am
Epsilon Delta wrote:
David Jay wrote:Sorry, Ben.

I'm an old "C" guy. I can even write "C" in C++ or C# ;)
At least it's not Fortran in C++. Or worse, Visual Basic in Visual Basic.
Hey, Fortran IV was my first programming class in 1976. MA306 (i.e. math department, there was no such thing as CS).

[edit] sorry, way off topic...
In the early 80s I took Fortran and COBOL in college. They still used punch cards. COBOL and punch cards were not fun.

tarmangani
Posts: 106
Joined: Thu Dec 28, 2017 10:14 am

Re: Anyone feel like talking "R"

Post by tarmangani » Wed Jan 24, 2018 2:06 pm

Starper wrote:
Wed Jan 24, 2018 1:32 pm
For a SAS user, does it make more sense to start learning Python or R?
Depends on what you're trying to accomplish. Myself, I use R for stats and Python for general-purpose coding. E.g., my wife had a hosted archive with over a hundred files that she needed anonymized/renamed. Python took care of that one. I was also trying to simulate different computer adaptive tests. R has a package that basically did everything I wanted (catR), so I didn't have to reinvent the wheel.
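To give a flavor of the renaming job, here is a rough sketch, not the actual script. It assumes "anonymized" simply means replacing each file name with a neutral sequential one; the function name anonymize_filenames, the "subject" prefix, and the sample file names are all hypothetical. The demo runs in a temporary folder so nothing real gets touched.

```python
import tempfile
from pathlib import Path


def anonymize_filenames(folder, prefix="file"):
    """Rename every file in `folder` to prefix_001.ext, prefix_002.ext, ...

    Returns a dict mapping each original name to its new name so the
    renaming can be traced back later if needed.
    Caveat: this sketch does not guard against a new name colliding
    with a file that already exists in the folder.
    """
    folder = Path(folder)
    # Sort so the numbering is stable from run to run.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    mapping = {}
    for i, path in enumerate(files, start=1):
        new_name = f"{prefix}_{i:03d}{path.suffix}"
        path.rename(folder / new_name)
        mapping[path.name] = new_name
    return mapping


# Demo on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["alice_smith.txt", "bob_jones.txt", "carol_lee.csv"]:
        (Path(tmp) / name).write_text("example")

    mapping = anonymize_filenames(tmp, prefix="subject")
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(mapping)
print(remaining)
```

Keeping the old-name-to-new-name mapping somewhere private is the important design choice here, since otherwise there is no way to link results back to the original files.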

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Thu Mar 29, 2018 1:42 pm

I just thought I would revisit this topic if anyone is interested in discussing. After learning the basics of the language I'm still unsure how to tell which packages are good and which should be avoided.

With SAS you can be pretty confident in any procedure. You just have to know which procedure will accomplish the task you have in mind. In many cases there are several procedures that can accomplish a given task. In this case you just pick the one that does the job most effectively.

Who vets the R packages? How do we know which packages are most heavily used and therefore likely the most thoroughly debugged? Is there some source that can provide guidance on which packages are safe for newbies?

I have also noticed that some packages like dplyr seem to take over some of the functionality of base R. It can be confusing to a person just learning the language when seeing examples of problems that are solved using dplyr or lubridate because then you don't learn the basics of the language. Just an observation.

Has anyone else here on Bogleheads started learning R or Python since this thread was last active? I continue to plug away with R and I'm very much enjoying the experience of learning something new.
A Boglehead can stay the course longer than the market can stay irrational.

alex_686
Posts: 4031
Joined: Mon Feb 09, 2015 2:39 pm

Re: Anyone feel like talking "R"

Post by alex_686 » Thu Mar 29, 2018 2:19 pm

DartThrower wrote:
Thu Mar 29, 2018 1:42 pm
Has anyone else here on Bogleheads started learning R or Python since this thread was last active? I continue to plug away with R and I'm very much enjoying the experience of learning something new.
I have been working my way through the R classes at edx.org / HarvardX. They are o.k. My one comment is that they teach statistics along with R at the same time. This has some advantages, but I already know my statistics so at times it is a bit slow.

DartThrower
Posts: 849
Joined: Wed Mar 11, 2009 4:10 pm
Location: Philadelphia

Re: Anyone feel like talking "R"

Post by DartThrower » Thu Mar 29, 2018 4:05 pm

alex_686 wrote:
Thu Mar 29, 2018 2:19 pm
DartThrower wrote:
Thu Mar 29, 2018 1:42 pm
Has anyone else here on Bogleheads started learning R or Python since this thread was last active? I continue to plug away with R and I'm very much enjoying the experience of learning something new.
I have been working my way through the R classes at edx.org / HarvardX. They are o.k. My one comment is that they teach statistics along with R at the same time. This has some advantages, but I already know my statistics so at times it is a bit slow.
Similar story here. I'm taking a course in causality through Coursera/Penn. The professor makes sure that any tricky R code is well explained. At this stage in my R learning I am glad for the chance this course gives me to practice basic R coding, and at the same time learn some more statistics.
A Boglehead can stay the course longer than the market can stay irrational.

wrongfunds
Posts: 1908
Joined: Tue Dec 21, 2010 3:55 pm

Re: Anyone feel like talking "R"

Post by wrongfunds » Thu Mar 29, 2018 5:07 pm

Am I the only one who thought that you meant something entirely different by the "R" word? I also thought that it seemed odd to be in Consumer Issue instead of in Investing Theories.

taurabora
Posts: 176
Joined: Fri Oct 30, 2009 12:41 pm

Re: Anyone feel like talking "R"

Post by taurabora » Thu Mar 29, 2018 6:01 pm

quantAndHold wrote:
Thu Jan 14, 2016 10:12 am
Not R, but I just finished the Stanford machine learning class on Coursera. It gave me a new appreciation for the joys of MATLAB.
I will forever hate MATLAB, as I was scarred by my first CompSci course at university, which used MATLAB. The class was required for every engineering major and the professor was consistently rated the worst professor at the university for many years.

oneleaf
Posts: 2350
Joined: Mon Feb 19, 2007 5:48 pm

Re: Anyone feel like talking "R"

Post by oneleaf » Fri Mar 30, 2018 11:02 am

DartThrower wrote:
Thu Mar 29, 2018 1:42 pm
I have also noticed that some packages like dplyr seem to take over some of the functionality of base R. It can be confusing to a person just learning the language when seeing examples of problems that are solved using dplyr or lubridate because then you don't learn the basics of the language. Just an observation.
Dplyr and lubridate are part of Hadley Wickham's Tidyverse. It is almost better to learn the entire paradigm rather than piece by piece. This way, the ways in which it takes over base R become clearer. The best book is available online here: http://r4ds.had.co.nz/

As someone who uses Python and R every day at work, the only reason why I switched partially from Python back to R for data munging was the Tidyverse. The code is so readable and easy to maintain and share. I used to use Python for munging and R for analysis and plotting. But now, R with the Tidyverse is my go-to for munging.

Base R is still necessary and you will fall back on it regularly, so you are still best served by learning it well. But I also recommend diving fully into the Tidyverse and using it as much as possible. Besides ease of readability, it is often faster. Dplyr’s joins are much faster than R’s merge.
