R Factors

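Factors are data objects used to categorize data and store it as levels. They can store both strings and integers. They are useful in columns that have a limited number of unique values, such as "Male"/"Female" or TRUE/FALSE, and in data analysis for statistical modeling.

A factor is created with the factor() function, which takes a vector as input.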

    # Create a vector as input.
    data <- c("East","West","East","North","North","East","West","West","West","East","North")

    print(data)
    print(is.factor(data))

    # Apply the factor function.
    factor_data <- factor(data)

    print(factor_data)
    print(is.factor(factor_data))

Running the code above produces the following result:

    [1] "East" "West" "East" "North" "North" "East" "West" "West" "West" "East" "North"
    [1] FALSE
    [1] East West East North North East West West West East North
    Levels: East North West
    [1] TRUE
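
To make the idea of "storing data as levels" more concrete, the sketch below inspects what a factor holds. It rebuilds the same vector so it runs on its own; the names regions and region_factor are illustrative and do not come from the original example.

```r
# Same input vector as in the example above (names are illustrative).
regions <- c("East","West","East","North","North","East","West","West","West","East","North")
region_factor <- factor(regions)

# The distinct categories, sorted alphabetically by default.
print(levels(region_factor))

# How many levels there are.
print(nlevels(region_factor))

# The integer codes stored internally; each one is an index into levels().
print(as.integer(region_factor))

# Number of observations per level.
print(table(region_factor))
```

The integer codes are why a factor is compact for columns with few unique values: each distinct label is stored once in the levels, and every observation only holds a small integer.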

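Factors in Data Frames

When a data frame is created with a column of text data, R versions before 4.0.0 treat the text column as categorical data and create a factor on it by default. From R 4.0.0 onward the default is stringsAsFactors = FALSE, so the conversion has to be requested explicitly.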

    # Create the vectors for the data frame.
    height <- c(132,151,162,139,166,147,122)
    weight <- c(48,49,66,53,67,52,40)
    gender <- c("male","male","female","female","male","female","male")

    # Create the data frame.
    # stringsAsFactors = TRUE is needed in R >= 4.0.0, where text columns
    # are no longer converted to factors by default.
    input_data <- data.frame(height,weight,gender,stringsAsFactors = TRUE)
    print(input_data)

    # Test if the gender column is a factor.
    print(is.factor(input_data$gender))

    # Print the gender column to see the levels.
    print(input_data$gender)

Running the code above produces the following result:

      height weight gender
    1    132     48   male
    2    151     49   male
    3    162     66 female
    4    139     53 female
    5    166     67   male
    6    147     52 female
    7    122     40   male
    [1] TRUE
    [1] male male female female male female male
    Levels: female male
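
Whether text columns are converted automatically depends on the R version and the stringsAsFactors setting (the default became FALSE in R 4.0.0). A version-independent alternative is to convert a single column explicitly, as in the minimal sketch below. The data frame name people is an assumption made for illustration, and the values are reused from the example above.

```r
# Build the data frame with automatic conversion switched off,
# so the behaviour is the same on every R version.
people <- data.frame(
  height = c(132,151,162,139,166,147,122),
  gender = c("male","male","female","female","male","female","male"),
  stringsAsFactors = FALSE
)
print(is.factor(people$gender))   # the column is still character here

# Convert just this one column to a factor.
people$gender <- factor(people$gender)
print(is.factor(people$gender))

# With a factor column, grouped summaries are straightforward,
# for example the mean height per level.
print(tapply(people$height, people$gender, mean))
```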

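Changing the Order of Levels

The order of the levels in a factor can be changed by applying the factor function again with the new order of levels.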

    data <- c("East","West","East","North","North","East","West","West","West","East","North")
    # Create the factor.
    factor_data <- factor(data)
    print(factor_data)

    # Apply the factor function with the required order of the levels.
    new_order_data <- factor(factor_data,levels = c("East","West","North"))
    print(new_order_data)

Running the code above produces the following result:

    [1] East West East North North East West West West East North
    Levels: East North West
    [1] East West East North North East West West West East North
    Levels: East West North
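
One point that is easy to get wrong: reordering levels with factor() is not the same as assigning to levels(). The sketch below contrasts the two on a small, made-up vector. The variable names f, reordered and renamed are illustrative only.

```r
# A small factor; default levels are East, North, West (alphabetical).
f <- factor(c("East","West","North","West"))

# Reordering: pass the desired order to factor(); the values stay the same.
reordered <- factor(f, levels = c("East","West","North"))
print(as.character(reordered))
print(levels(reordered))

# Renaming: assigning to levels() replaces the labels position by position,
# so here "North" becomes "West" and "West" becomes "North".
renamed <- f
levels(renamed) <- c("East","West","North")
print(as.character(renamed))
```

When only the order should change, calling factor() with the levels argument is the safer choice, because it leaves the data values themselves untouched.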

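Generating Factor Levels

Factor levels can be generated with the gl() function. It takes two integers as input: the number of levels and the number of times each level is repeated.

Syntax

    gl(n, k, labels)

The parameters are described below:

  • n is an integer giving the number of levels.

  • k is an integer giving the number of replications.

  • labels is a vector of labels for the resulting factor levels.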

    v <- gl(3, 4, labels = c("Tampa", "Seattle","Boston"))
    print(v)

Running the code above produces the following result:

     [1] Tampa   Tampa   Tampa   Tampa   Seattle Seattle Seattle Seattle Boston
    [10] Boston  Boston  Boston
    Levels: Tampa Seattle Boston
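
As a rough cross-check of what gl() generates, the sketch below builds a comparable factor by hand with rep() and factor(), then shows the optional length argument of gl(), which is not covered in the syntax above. The names cities, w and alternating, and the labels "low" and "high", are made up for illustration.

```r
cities <- c("Tampa", "Seattle", "Boston")

# gl(): 3 levels, each repeated 4 times in a row.
v <- gl(3, 4, labels = cities)

# A comparable factor built by hand with rep() and factor().
w <- factor(rep(cities, each = 4), levels = cities)
print(identical(as.character(v), as.character(w)))
print(identical(levels(v), levels(w)))

# The optional length argument recycles the pattern to a given total length;
# with k = 1 the two levels alternate.
alternating <- gl(2, 1, length = 6, labels = c("low", "high"))
print(alternating)
```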