CS5L13-3: Counting Occurrence Frequency with Grouping

Code keywords in this chapter

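group ... by ... into ...        // first slot: a single element of the source (the range variable); second slot: the grouping key (an expression); third slot: a name for the resulting group, usable in the rest of the query

IEnumerable<>.GroupBy() // takes a key selector, groups the sequence by it, and returns the sequence of groups
IEnumerable<>.Select() // takes a selector and returns a new sequence of the projected values
IEnumerable<>.ToDictionary() // takes two selectors, one producing the key and one producing the value, and converts the sequence into a dictionary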

Counting frequencies

Suppose we have an array of length 200 filled with random values in the range 0 to 19.

Random random = new Random(1334);       // a fixed seed makes the generated values reproducible from run to run
var arr = new int[200];
for (int i = 0; i < arr.Length; i++)
    arr[i] = random.Next(0, 20);        // the upper bound is exclusive, so every value is in 0..19
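The role of the seed can be checked with a small sketch. It assumes a .NET 6 or later console project using top-level statements; MakeArray is a hypothetical helper introduced here for illustration and is not part of the original code.

```csharp
using System;
using System.Linq;

// Hypothetical helper: builds the 200-element array from a given seed.
static int[] MakeArray(int seed)
{
    Random random = new Random(seed);
    var arr = new int[200];
    for (int i = 0; i < arr.Length; i++)
        arr[i] = random.Next(0, 20); // upper bound 20 is exclusive
    return arr;
}

int[] first = MakeArray(1334);
int[] second = MakeArray(1334);

// The same seed yields the same sequence, so both arrays match element by element.
Console.WriteLine("same contents: " + first.SequenceEqual(second));
// Every value falls in the range 0..19.
Console.WriteLine("all in range: " + first.All(v => v >= 0 && v < 20));
```

Both lines print True: the seed does not make the values distinct from each other, it only makes the whole sequence repeatable.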

Without LINQ

static void Normal3(int[] arr)
{
    Console.WriteLine("Without LINQ");
    Dictionary<int, int> dict = new(); // maps each array value (key) to the number of times it occurs (value)
    foreach (int i in arr)
    {
        if (dict.ContainsKey(i)) // already seen: increment the count
            dict[i] += 1;
        else
            dict.Add(i, 1); // first occurrence: add a new key/value pair
    }
    foreach (int i in dict.Keys) // print every entry
        Console.WriteLine(i + " frequency is " + dict[i]);
}

Output:

12 frequency is 11
4 frequency is 17
5 frequency is 8
9 frequency is 15
6 frequency is 8
13 frequency is 7
0 frequency is 9
3 frequency is 10
1 frequency is 8
18 frequency is 7
14 frequency is 10
2 frequency is 8
15 frequency is 14
16 frequency is 10
8 frequency is 12
10 frequency is 11
17 frequency is 5
11 frequency is 11
7 frequency is 8
19 frequency is 11
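As a variation on the loop above, the ContainsKey check followed by an indexer access can be folded into a single TryGetValue call. This is only a sketch under assumptions: it uses top-level statements (.NET 6 or later), and CountWithTryGetValue and the small sample array are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original text): counts occurrences
// with one dictionary lookup per element before the write.
static Dictionary<int, int> CountWithTryGetValue(int[] values)
{
    var counts = new Dictionary<int, int>();
    foreach (int value in values)
    {
        // When the key is missing, TryGetValue returns false and sets current to 0.
        counts.TryGetValue(value, out int current);
        counts[value] = current + 1;
    }
    return counts;
}

int[] sample = { 3, 1, 3, 2, 1, 3 };
Dictionary<int, int> sampleCounts = CountWithTryGetValue(sample);
foreach (KeyValuePair<int, int> pair in sampleCounts)
    Console.WriteLine(pair.Key + " frequency is " + pair.Value);
```

For the sample array, the value 3 is counted three times, 1 twice, and 2 once.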

Grouping

Reference: group clause - C# reference | Microsoft Learn

// In a query expression
var result = from x in arr
             group <element to group> by <grouping key> into <group>
             select new { num = <group>.Key, count = <group>.Count() };

// In method (chained) syntax
var result = arr
    .GroupBy(<element to group> => <grouping key>)
    .Select(<group> => new { num = <group>.Key, count = <group>.Count() });

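Both group ... by and .GroupBy() produce a sequence whose elements are of type IGrouping<TKey, TElement>. A grouping resembles a key/value pair in a dictionary: it carries a key, and it is itself the collection of elements that share that key.

IGrouping<>.Key returns the group's key (what it holds depends on the grouping expression), and the LINQ extension method Count(), called on the grouping, returns the number of elements in that group.

It is therefore convenient to project each group into an anonymous type with Select(), or to convert the groups straight into a Dictionary<> with ToDictionary() (ToDictionary() has no query-expression keyword, so it is only available as a method call).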
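To make the shape of IGrouping<> concrete, here is a minimal sketch on a six-element array. The sample data and variable names are assumptions made for illustration, and the code uses top-level statements (.NET 6 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] sample = { 3, 1, 3, 2, 1, 3 };

// Each element of groups is an IGrouping<int, int>: a key plus the elements that share it.
IEnumerable<IGrouping<int, int>> groups = sample.GroupBy(x => x);

foreach (IGrouping<int, int> g in groups)
{
    // g.Key is the grouping key; enumerating g yields the members of the group.
    Console.WriteLine("key " + g.Key + ", count " + g.Count() + ", members " + string.Join(",", g));
}
// Prints:
// key 3, count 3, members 3,3,3
// key 1, count 2, members 1,1
// key 2, count 1, members 2

// ToDictionary takes a key selector and a value selector.
Dictionary<int, int> counts = groups.ToDictionary(grp => grp.Key, grp => grp.Count());
Console.WriteLine("3 occurs " + counts[3] + " times");
```

The groups come out in the order in which each key first appears in the source, which is why 3 is listed before 1 and 2.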

LINQ examples

Using LINQ (query expression)

static void QueryExpressionLINQ3(int[] arr)
{
    Console.WriteLine("Query expression LINQ");
    var result = from x in arr                                    // take every element of arr and name it x
                 group x by x into g                              // group the elements by their own value; g is an IGrouping<int, int>, similar to a key/value pair
                 select new { num = g.Key, count = g.Count() };   // project into an anonymous type: num is the key (the array value), count is the number of elements in the group
    foreach (var keyValuePair in result) // print every entry
        Console.WriteLine(keyValuePair.num + " frequency is " + keyValuePair.count);
}

Output:

12 frequency is 11
4 frequency is 17
5 frequency is 8
9 frequency is 15
6 frequency is 8
13 frequency is 7
0 frequency is 9
3 frequency is 10
1 frequency is 8
18 frequency is 7
14 frequency is 10
2 frequency is 8
15 frequency is 14
16 frequency is 10
8 frequency is 12
10 frequency is 11
17 frequency is 5
11 frequency is 11
7 frequency is 8
19 frequency is 11
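The listing above starts with 12, 4, 5 rather than 0, 1, 2 because grouping yields the groups in the order in which each key first appears in the source. If sorted output is preferred, an orderby clause can be added after the group. The following is a sketch under the same setup as the text (seed 1334, 200 values), written with top-level statements (.NET 6 or later); the names sorted and sortedList are introduced here for illustration.

```csharp
using System;
using System.Linq;

Random random = new Random(1334);
var arr = new int[200];
for (int i = 0; i < arr.Length; i++)
    arr[i] = random.Next(0, 20);

// Same grouping as before, with the groups ordered by key before projecting.
var sorted = from x in arr
             group x by x into g
             orderby g.Key
             select new { num = g.Key, count = g.Count() };

// Materialize once so the query is not re-evaluated on each enumeration.
var sortedList = sorted.ToList();
foreach (var item in sortedList)
    Console.WriteLine(item.num + " frequency is " + item.count);
```

The counts themselves are unchanged; only the order in which the groups are printed differs.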

Using LINQ (method chaining)

static void ChainedExpressionLINQ3(int[] arr)
{
    Console.WriteLine("Method-chaining LINQ");
    // Group arr by the value of each element x; at this point the result is an enumerable of groupings.
    // Then project each grouping g into an anonymous type: num is the group key (the array value) and
    // count is the number of elements in the group, so result is an enumerable of anonymous objects.
    var result = arr.GroupBy(x => x)
                    .Select(g => new { num = g.Key, count = g.Count() });
    // Alternatively, convert the groupings directly into a Dictionary<int, int>
    var res = arr.GroupBy(x => x)
                 .ToDictionary(g => g.Key, g => g.Count());
    foreach (var keyValuePair in result)
        Console.WriteLine(keyValuePair.num + " frequency is " + keyValuePair.count);
}

Output:

12 frequency is 11
4 frequency is 17
5 frequency is 8
9 frequency is 15
6 frequency is 8
13 frequency is 7
0 frequency is 9
3 frequency is 10
1 frequency is 8
18 frequency is 7
14 frequency is 10
2 frequency is 8
15 frequency is 14
16 frequency is 10
8 frequency is 12
10 frequency is 11
17 frequency is 5
11 frequency is 11
7 frequency is 8
19 frequency is 11
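The res dictionary built with ToDictionary() in the method above is never printed. As a rough sketch of how it could be consumed, the code below rebuilds it from the same seeded array and cross-checks it against a hand-written count. It assumes top-level statements (.NET 6 or later); the names manual and agree are introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

Random random = new Random(1334);
var arr = new int[200];
for (int i = 0; i < arr.Length; i++)
    arr[i] = random.Next(0, 20);

// Hand-written count, following the non-LINQ approach.
var manual = new Dictionary<int, int>();
foreach (int v in arr)
{
    if (manual.ContainsKey(v))
        manual[v] += 1;
    else
        manual.Add(v, 1);
}

// LINQ count converted directly into a dictionary.
Dictionary<int, int> res = arr.GroupBy(x => x)
                              .ToDictionary(g => g.Key, g => g.Count());

// Both dictionaries should hold the same keys with the same counts.
bool agree = manual.Count == res.Count
             && manual.All(p => res.TryGetValue(p.Key, out int n) && n == p.Value);
Console.WriteLine("counts agree: " + agree); // prints "counts agree: True"

// A dictionary can be looked up by key, or printed in key order.
foreach (KeyValuePair<int, int> pair in res.OrderBy(p => p.Key))
    Console.WriteLine(pair.Key + " frequency is " + pair.Value);
```

Choosing ToDictionary() over Select() makes sense when the counts are later looked up by value, since the dictionary gives direct access by key.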