1. What is the difference between %% and %/%?

The %% operator returns the remainder of dividing the first vector by the second, and the %/% operator returns the integer quotient of that division.
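A small sketch of the two operators on made-up vectors (the values are illustrative only). Note that in R the remainder takes the sign of the divisor, so the quotient is rounded down rather than toward zero.

```r
x <- c(7, 10, -7)
y <- c(2, 3, 2)

remainder <- x %% y    # element-wise remainder: 1 1 1
quotient  <- x %/% y   # element-wise integer quotient: 3 3 -4

# The two results always satisfy x == y * quotient + remainder
all(x == y * quotient + remainder)
```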

2. How to use apply() function in R?

The apply() function applies the same function over the rows or columns of an array or matrix. For example, it can compute the mean of every row.
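As a minimal illustration, the sketch below uses a small made-up matrix; the second argument of apply() selects the margin (1 for rows, 2 for columns).

```r
m <- matrix(1:6, nrow = 2)       # 2 x 3 matrix, filled column by column

row_means <- apply(m, 1, mean)   # one mean per row: 3 4
col_means <- apply(m, 2, mean)   # one mean per column: 1.5 3.5 5.5
```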

3. Define aggregate() function

The aggregate() function collapses data in R by one or more BY variables and applies a summary function to each group. The BY variables must be supplied as a list, even when there is only one.
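A short sketch with a hypothetical data frame (the names df, group, and value are made up for illustration). It shows the list-based BY interface and, for comparison, the equivalent formula interface.

```r
df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

# BY variable passed as a named list; the result column is called x
by_group <- aggregate(df$value, by = list(group = df$group), FUN = mean)

# Same computation through the formula interface
by_formula <- aggregate(value ~ group, data = df, FUN = mean)
```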

4. What is the difference between lapply and sapply?

lapply() always returns its output as a list, while sapply() tries to simplify the output to a vector or matrix.
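The difference is easiest to see on the same input. The list below is made up for illustration.

```r
nums <- list(a = 1:3, b = 4:6)

as_list   <- lapply(nums, sum)     # list with elements a = 6 and b = 15
as_vector <- sapply(nums, sum)     # named vector: a = 6, b = 15
as_matrix <- sapply(nums, range)   # 2 x 2 matrix, one column per element
```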

5. Define doBy package

The doBy package is used to compute groupwise summaries of a data table, specified with a model formula and a function.

6. What is the use of the table() function?

The table() function generates frequency tables in R, counting how often each value or combination of values occurs.
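A minimal sketch on a made-up character vector:

```r
colours <- c("red", "blue", "red", "green", "red", "blue")

freq <- table(colours)   # counts per distinct value
freq[["red"]]            # 3
```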

7. Define lattice package

The lattice package improves on base R graphics by providing better defaults and makes it easy to display multivariate relationships.
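As a rough sketch, the conditioning operator | in a lattice formula draws one panel per group. The example assumes the built-in mtcars dataset and only builds the plot object; printing it would draw the panels.

```r
library(lattice)

# Scatter plot of mpg against weight, one panel per number of cylinders
p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            xlab = "Weight", ylab = "Miles per gallon")
```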

8. Describe anova() function

The anova() function is used to compare nested models.
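A minimal sketch, assuming two linear models on the built-in mtcars dataset where the smaller model is nested in the larger one:

```r
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# F-test of whether adding hp significantly improves the fit
comparison <- anova(fit_small, fit_large)
```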

9. What is the leaps() function?

The leaps() function, defined in the leaps package, is used to perform all-subsets regression.

10. What is the full form of MANOVA and what is it used for?

MANOVA stands for Multivariate Analysis of Variance and is used to test more than one dependent variable at the same time.
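A brief sketch using the manova() function from base R's stats package, assuming the built-in iris dataset with two dependent variables bound together by cbind():

```r
# Test whether the two measurements jointly differ across species
fit <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)
result <- summary(fit)
```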

11. What is the goal of integration of R and Hadoop?

R and Hadoop are integrated so that R code can be executed on Hadoop and can access the data stored in it.

12. Differentiate sample() and subset() in R.

sample() is used to draw a random sample of a given size from a dataset, while subset() is used to select the variables and observations that meet given conditions.
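The contrast can be sketched on the built-in mtcars dataset. The seed, sample size, and filter condition are arbitrary choices for illustration.

```r
set.seed(42)   # make the random draw reproducible

# sample(): pick 5 row indices at random, then take those rows
rows <- sample(nrow(mtcars), size = 5)
random_rows <- mtcars[rows, ]

# subset(): keep rows matching a condition and only the listed columns
efficient <- subset(mtcars, mpg > 25, select = c(mpg, cyl))
```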

13. Describe fitdistr() function

The fitdistr() function is used for maximum-likelihood fitting of univariate distributions and comes with the MASS package.
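A minimal sketch, assuming simulated normal data so that the true parameters are known; the seed and parameter values are arbitrary.

```r
library(MASS)

set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)

# Maximum-likelihood estimates of the mean and sd of a normal distribution
fit <- fitdistr(x, "normal")
fit$estimate
```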

14. What do iPlots and GGobi denote?

iPlots is a package that provides bar plots, box plots, mosaic plots, histograms, parallel plots, and scatter plots. GGobi is an open-source visualization program used to explore high-dimensional data, and it can be used together with R.

15. Explain t.test() in R.

t.test() is the function used to compare the means of two groups and test whether they are equal or not.
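A small sketch with two simulated groups whose true means differ; the seed, group sizes, and means are assumptions made for illustration.

```r
set.seed(123)
group_a <- rnorm(30, mean = 10)
group_b <- rnorm(30, mean = 12)

# Welch two-sample t-test (the default); a small p-value suggests unequal means
result <- t.test(group_a, group_b)
result$p.value
```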

16. What are the integration methods of R to Hadoop?

RHadoop, Hadoop Streaming, RHIPE, and ORCH are the methods used to integrate R with Hadoop.

17. Describe the random walk model in R.

A random walk is the simplest example of a non-stationary process. It has no specified mean or variance and shows strong dependence over time. Its changes (increments) are white noise, so a random walk can be simulated in R by accumulating white noise.
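One way to sketch this is to take the cumulative sum of Gaussian white noise; the seed and series length are arbitrary. Differencing the walk recovers the white-noise increments.

```r
set.seed(1)
white_noise <- rnorm(100)            # independent increments
random_walk <- cumsum(white_noise)   # each value = previous value + new increment

# diff() undoes the accumulation and returns the increments after the first
increments <- diff(random_walk)
```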

18. Define white noise model

The white noise model is the most basic time series model and a simple stationary process. It has a fixed constant mean, a fixed constant variance, and no correlation over time. It can be simulated as follows.

# ARIMA(0,0,0) has no AR, differencing, or MA terms, so it is pure white noise
wn <- arima.sim(model = list(order = c(0, 0, 0)), n = 50)

19. List the names of the packages used for data manipulation.

mice, missForest, mi, Hmisc, Amelia, and imputeR are popular R packages used for data manipulation, specifically for imputing missing values.

20. What is the initialize() function in R?

initialize() is the method that initializes an object's data members (such as private fields) when the object is created.
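The answer does not say which class system is meant, so the sketch below assumes S4 from the methods package, where new() calls the initialize() method. The class name Account and its slots are hypothetical.

```r
library(methods)

setClass("Account", slots = c(owner = "character", balance = "numeric"))

# Custom initialize() method: give balance a default of 0 when it is not supplied
setMethod("initialize", "Account", function(.Object, ..., balance = 0) {
  .Object <- callNextMethod(.Object, ..., balance = balance)
  .Object
})

acct <- new("Account", owner = "Ann")
acct@balance   # 0
```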

