Sunday, May 10, 2020

R – A TRUTHFUL PROGRAMMING LANGUAGE FOR DATA SCIENTISTS


Statistical programming languages play an important role in the day-to-day work of data scientists. They are used to develop and practice algorithms, analyze unstructured data, perform statistical computations, and represent and visualize data graphically. Capturing, communicating, storing, analyzing, and aggregating data by hand takes a lot of effort, and manual work does not guarantee accuracy in complex calculations. Statistical programming languages reduce that effort and improve accuracy, which is why they matter in modern technology.

R for Data Science


Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge. The goal of "R for Data Science" is to help you learn the most important tools in R that will allow you to do data science [1].

Introduction to R.

R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an open-source route to participation in that activity [2].


R Environment.

R is an integrated suite of software facilities for data manipulation, calculation, and graphical display. It includes an effective data handling and storage facility; a suite of operators for calculations on arrays, in particular matrices; a large, coherent, integrated collection of intermediate tools for data analysis; graphical facilities for data analysis and display, either on-screen or on hardcopy; and a well-developed, simple, and effective programming language that includes conditionals, loops, user-defined recursive functions, and input and output facilities [2].
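The facilities listed above can be sketched in a few lines of base R. This is only a minimal illustration, not an exhaustive tour; the names m, factorial_rec, and squares are made up for this example.

```r
# Operators for calculations on arrays, in particular matrices
m <- matrix(1:4, nrow = 2)   # filled column by column: rows are (1, 3) and (2, 4)
prod_m <- m %*% t(m)         # matrix product of m with its transpose
doubled <- m * 2             # element-wise arithmetic

# A user-defined recursive function that uses a conditional
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop that collects the squares of 1 to 5
squares <- numeric(0)
for (i in 1:5) {
  squares <- c(squares, i^2)
}

# Output facilities
print(prod_m)
cat("5! =", factorial_rec(5), "\n")
cat("Squares:", squares, "\n")
```

In everyday R code the loop would usually be written as the vectorized expression (1:5)^2; the explicit loop is shown only because the paragraph mentions loops.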

R Language.

R is an extremely versatile open-source programming language for statistics and data science [3]. In R, you can carry out almost any kind of statistical computation, either through function-based syntax or through full programs, with powerful debugging facilities and interfaces to many other programming languages. The resulting statistics can then be displayed with R's high-level graphical tools [4]. Many data scientists who work with big data in business, industry, and government use the R environment and its packages, as do those who work in medicine, academia, and other fields. R has the following features [5]:

1. A short, concise syntax that speeds up work on your data.
2. Support for a variety of formats for loading and storing data, both locally and over the internet.
3. The ability to perform tasks in memory using a consistent syntax.
4. A long list of tools (functions and packages) for data analysis; some are built in and the rest are open source.
5. Several easy ways to present statistical results graphically, and the ability to save those graphs to disk.
6. The ability to automate analyses, create new functions (R is a programming language), and extend existing language features.

Users do not need to reload their data every time, because R can save the workspace between sessions, along with the history of their commands. If you prefer a GUI, many free ones are available for R, such as RStudio, R Commander, StatET, ESS, and JGR (Java GUI for R).
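A few of these features can be illustrated with base R alone. The sketch below is a rough example under stated assumptions: the scores data frame and the rescale01 helper are invented for illustration, and temporary files stand in for real data paths.

```r
# Hypothetical sample data, invented for this example
scores <- data.frame(
  student = c("A", "B", "C", "D"),
  score   = c(72, 85, 90, 64)
)

# Store the data on disk as CSV and load it back
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
loaded <- read.csv(csv_path)

# Create a new function: rescale a numeric vector to the range [0, 1]
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
loaded$scaled <- rescale01(loaded$score)
avg <- mean(loaded$score)

# Represent the results graphically and save the graph to disk
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)
barplot(loaded$score, names.arg = loaded$student, main = "Scores")
invisible(dev.off())

# Keep an object for a later session, then restore it
rds_path <- tempfile(fileext = ".rds")
saveRDS(loaded, rds_path)
restored <- readRDS(rds_path)
```

saveRDS() stores a single object; save.image() would store the whole workspace, which is closer to what R offers to do when a session ends.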

Advantages of R.

1. Programmers do not need to reload data every time, because R can save the data between sessions along with the history of their commands.
2. It supports various formats for storing and loading data, both locally and over the internet.
3. It is highly compatible and can be paired with other programming languages such as C, C++, Java, and Python.
4. It is easy to integrate with various database management systems and with technologies such as Hadoop.
5. R is well known as the lingua franca of statistics.
6. Reporting the results of an analysis is easy, and R also helps build interactive web apps that let users explore the results.

 

Disadvantages of R.    

1. R code and R packages can be much slower than comparable code in other languages such as Python.
2. R lacks basic security features, which restricts how it can be embedded in a web application.
3. R needs the entire dataset in one place, in memory, so it requires more memory.
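The memory point can be made concrete with a small sketch. It assumes a hypothetical text file of numbers written to a temporary path; object.size() reports how much memory an in-memory object uses, and reading through a file connection in chunks is one common way to avoid holding everything at once.

```r
# Hypothetical data: 1000 numeric values held entirely in memory
values <- as.numeric(1:1000)
print(object.size(values), units = "Kb")   # memory used by the vector

# Write the values to a temporary file, one per line
path <- tempfile(fileext = ".txt")
writeLines(as.character(values), path)

# Process the file 100 lines at a time instead of loading it all
con <- file(path, open = "r")
total <- 0
n_read <- 0
repeat {
  chunk <- readLines(con, n = 100)
  if (length(chunk) == 0) break            # end of file reached
  total <- total + sum(as.numeric(chunk))
  n_read <- n_read + length(chunk)
}
close(con)

cat("Lines read:", n_read, "Sum:", total, "\n")
```

Only one chunk of 100 lines is held in memory at a time, at the cost of more code than a single read call.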


As both a tool and a programming language, R has several advantages and disadvantages for data scientists compared with other statistical programming languages.


References :
[1] H. Wickham and G. Grolemund, "R for Data Science", January 2017, O'Reilly Media Inc.
[2] "Introduction to R". Retrieved from https://www.r-project.org/about.html
[3] W. McKinney, "Python for Data Analysis", 1st ed., 2013, O'Reilly Media Inc., pp. 453.
[4] D. Rotolo and L. Leydesdorff, "Matching Medline/PubMed data with Web of Science: A routine in R language", Journal of the Association for Information Science and Technology, vol. 66, no. 10, 2015, pp. 2155–2159.
[5] T. Siddiqui and M. Al Kadri, "Review of Programming Languages and Tools for Big Data Analytics", International Journal of Advanced Research in Computer Science, May-June 2017, pp. 1113.

