Computational statistics is the interface between statistics and computer science. While statistics helps collect, organize, analyse and interpret data, computer science helps in writing algorithms and representing results. There is a wide variety of languages that help with statistical data analysis. Let us look at one of them, the R language.
I currently use R for statistical data analysis at work. It is a neat language that is simple to learn and get started with; I picked it up with no prior knowledge of either R or statistical methods. I mainly use it to apply statistical methods and to analyse data.
R is an interpreted language and comes with an abundance of third-party packages that support a variety of statistical computing methods. The R manual can be found here. The list below mentions some of the features that I have encountered so far while working with R.
1. Descriptive statistics for R vectors - Summarizes a given numeric vector with information such as the mean, mode, median and maximum.
2. Plotting graphs to represent results - A complex data frame (table) can be turned into a graph to better understand the data. R's graphics are powerful and offer many features for creating detailed, neat graphs.
3. Variance analysis - To understand how the variables (vectors) stored in R are related to one another. There are several methods, depending on the type of values the variables take: normal (linear), log-linear and logistic regression.
4. Multivariate analysis - This technique helps in studying the statistical outcome of more than one variable at a time.
5. Data types - R is rich in the types of data it can consume and the data structures it offers. Arrays, lists, vectors, matrices and data frames generally satisfy a user's need to work with a variety of data.
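A few of the features listed above can be sketched in base R. This is only a rough illustration: the built-in `mtcars` data set, the chosen columns and the helper `stat_mode` are assumptions made for the example and do not come from the original post. Note that base R has no built-in function for the statistical mode, so the sketch computes it by hand.

```r
# Illustrative sketch only: mtcars and the column choices are assumptions.

# --- Data structures: vector, matrix, list and data frame ---
mpg_vec <- mtcars$mpg                                # numeric vector
m       <- matrix(1:6, nrow = 2)                     # 2 x 3 matrix
info    <- list(name = "mtcars", n = nrow(mtcars))   # list holding mixed types
cars_df <- mtcars[, c("mpg", "wt", "hp", "am")]      # data frame (table)

# --- Descriptive statistics for a numeric vector ---
print(summary(mpg_vec))   # minimum, quartiles, median, mean, maximum

# Hypothetical helper: most frequent value (first one wins on ties)
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}
print(stat_mode(mpg_vec))

# --- Relating variables: linear and logistic regression ---
# Linear model: fuel efficiency explained by weight and horsepower
fit_lm <- lm(mpg ~ wt + hp, data = cars_df)
print(coef(fit_lm))

# Logistic model: probability of a manual gearbox (am = 1) given weight
fit_glm <- glm(am ~ wt, data = cars_df, family = binomial)
print(coef(fit_glm))

# --- Looking at several variables together ---
# Pairwise correlations between three numeric columns
cor_mat <- cor(cars_df[, c("mpg", "wt", "hp")])
print(round(cor_mat, 2))
```

The same formula interface (`response ~ predictors`) is shared by `lm` and `glm`, which is part of what makes it easy to move from one kind of model to another.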
Apart from statistical data analysis, R is also useful in areas such as analytics (e.g. machine learning, statistical modeling), graphics and visualization (refer to my previous 2 blogs) and data mining.
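As a small taste of the data mining side, here is a minimal sketch of k-means clustering with the `kmeans` function from base R's stats package. The built-in `iris` data set, the choice of three clusters and the seed value are all assumptions made for illustration, not something taken from the original post.

```r
# Illustrative sketch only: iris, centers = 3 and the seed are assumptions.
set.seed(42)                              # make the random starts reproducible

# Standardize the four numeric measurement columns
features <- scale(iris[, 1:4])

# Run k-means with 3 clusters, keeping the best of 10 random starts
km <- kmeans(features, centers = 3, nstart = 10)

# Compare the discovered clusters with the known species (inspection only;
# the species labels are not used by the clustering itself)
print(table(cluster = km$cluster, species = iris$Species))
```

Scaling the columns first keeps measurements with larger numeric ranges from dominating the distance calculation.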
I would recommend learning R (or Octave/MATLAB) since it is different from the general-purpose programming languages we know (Java, Python, C etc.) and comes packed with many statistical methods that are simple to use and explore. A statistical computing language looks good on a resume and may ultimately help you explore a new domain.

Hi Mehal,
I find this article really interesting. I am supposed to use this language for one of my projects to generate a receiver operating characteristic curve. I am still new to this language and trying to figure things out. Looks like you have very good experience with R. Now I really understand why they recommended that I use this language to generate the ROC curve. Your article provides a good justification for why we need to use this language. It is a nice write-up! I will have to catch hold of you if I hit a block while working on R.
Nice work!
Great job, keep posting more. I will share your link with my friends.