在一个项目里面有这么一个技术需求:
1.集合中元素个数,10M
2.根据上限和下限从一个Set中过滤出满足要求的元素集合.
实际这个是个很典型的技术要求, 之前的项目也遇见过,但是因为当时的类库不多, 都是直接手写实现的. 方式基本等同于第一个方式.
在这个过程中, 我写了四个方式, 基本记录到下面.
第一个方式:对Set进行迭代器遍历, 判断每个元素是否都在上限和下限范围中.如果满足则添加到结果集合中, 最后返回结果集合.
测试效果:集合大小100K, 运算时间 3000ms+
过滤部分的逻辑如下:
测试效果: 集合大小10M, 运算时间: 12000ms+(获得TreeSet) , 200ms(获得结果)
过滤部分的逻辑如下(非常繁琐):
测试效果:集合大小10M,过滤结果1M, 运算时间: 1000ms+
过滤部分的代码如下:
第四种方式:使用Guava(google Collections), 直接对于原始Set进行Filter
测试效果:集合大小10M,过滤结果1M, 运算时间: 100ms-
过滤部分的代码如下:
四种方式对比如下:
第一种方式: 仅依赖于JAVA原生类库 遍历时间最慢, 代码量很小
第二种方式: 仅依赖于JAVA原生类库 遍历时间比较慢(主要慢在生成有序Set), 代码量最多
第三种方式: 依赖于Apache Commons Collections, 遍历时间比较快, 代码量很少
第四种方式: 依赖于Guava, 遍历时间最快, 代码量很少
基于目前个人的技术水平和视野, 第四种方式可能是最佳选择.
记录一下, 以后可能还会有更好的方案.
1.集合中元素个数,10M
2.根据上限和下限从一个Set中过滤出满足要求的元素集合.
实际这个是个很典型的技术要求, 之前的项目也遇见过,但是因为当时的类库不多, 都是直接手写实现的. 方式基本等同于第一个方式.
在这个过程中, 我写了四个方式, 基本记录到下面.
第一个方式:对Set进行迭代器遍历, 判断每个元素是否都在上限和下限范围中.如果满足则添加到结果集合中, 最后返回结果集合.
测试效果:集合大小100K, 运算时间 3000ms+
过滤部分的逻辑如下:
1 void filerSet(Set<BigDecimal> targetSet, String lower, String higher) {
2 BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
3 BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
4
5 Set<BigDecimal> returnSet = new HashSet<BigDecimal>();
6 for (BigDecimal object : targetSet) {
7 if (isInRange(object, bdLower, bdHigher)) {
8 returnSet.add(object);
9 }
10 }
11 }
12
13 private boolean isInRange(BigDecimal object, BigDecimal bdLower,
14 BigDecimal bdHigher) {
15 return object.compareTo(bdLower) >= 0
16 && object.compareTo(bdHigher) <= 0;
17 }
第二个方式: 借助TreeSet, 原始集合进行排序, 然后直接subset. 2 BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
3 BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
4
5 Set<BigDecimal> returnSet = new HashSet<BigDecimal>();
6 for (BigDecimal object : targetSet) {
7 if (isInRange(object, bdLower, bdHigher)) {
8 returnSet.add(object);
9 }
10 }
11 }
12
13 private boolean isInRange(BigDecimal object, BigDecimal bdLower,
14 BigDecimal bdHigher) {
15 return object.compareTo(bdLower) >= 0
16 && object.compareTo(bdHigher) <= 0;
17 }
测试效果: 集合大小10M, 运算时间: 12000ms+(获得TreeSet) , 200ms(获得结果)
过滤部分的逻辑如下(非常繁琐):
1 Set<BigDecimal> getSubSet(TreeSet<BigDecimal> targetSet, String lower,
2 String higher) {
3
4 BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
5 BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
6
7 if ((bdHigher.compareTo(targetSet.first()) == -1)
8 || (bdLower.compareTo(targetSet.last()) == 1)) {
9 return null;
10 }
11
12 boolean hasLower = targetSet.contains(bdLower);
13 boolean hasHigher = targetSet.contains(bdHigher);
14 if (hasLower) {
15 if (hasHigher) {
16 System.out.println("get start:" + bdLower);
17 System.out.println("get end:" + bdHigher);
18 return targetSet.subSet(bdLower, true, bdHigher, true);
19 } else {
20 BigDecimal newEnd = null;
21 System.out.println("get start:" + bdLower);
22 SortedSet<BigDecimal> returnSet = null;
23 if (bdHigher.compareTo(targetSet.last()) != -1) {
24 newEnd = targetSet.last();
25 } else {
26 SortedSet<BigDecimal> newTargetSet = targetSet
27 .tailSet(bdLower);
28 for (BigDecimal object : newTargetSet) {
29 if (object.compareTo(bdHigher) == 1) {
30 newEnd = object;
31 break;
32 } else if (object.compareTo(bdHigher) == 0) {
33 newEnd = object;
34 break;
35 }
36 }
37 }
38 returnSet = targetSet.subSet(bdLower, true, newEnd, true);
39 if (newEnd.compareTo(bdHigher) == 1) {
40 returnSet.remove(newEnd);
41 }
42 return returnSet;
43 }
44
45 } else {
46 if (hasHigher) {
47 System.out.println("get end:" + bdHigher);
48 TreeSet<BigDecimal> newTargetSet = (TreeSet<BigDecimal>) targetSet
49 .headSet(bdHigher, true);
50 BigDecimal newStart = null;
51 SortedSet<BigDecimal> returnSet = null;
52
53 if (bdLower.compareTo(targetSet.first()) == -1) {
54 newStart = targetSet.first();
55 } else {
56 for (BigDecimal object : newTargetSet) {
57 if (object.compareTo(bdLower) != -1) {
58 newStart = object;
59 break;
60 }
61 }
62 }
63 returnSet = targetSet.subSet(newStart, true, bdHigher, true);
64
65 return returnSet;
66 } else {
67 System.out.println("Not get start:" + bdLower);
68 System.out.println("Not get end:" + bdHigher);
69 BigDecimal newStart = null;
70 BigDecimal newEnd = null;
71 if (bdHigher.compareTo(targetSet.last()) != -1) {
72 newEnd = targetSet.last();
73 }
74 if (bdLower.compareTo(targetSet.first()) == -1) {
75 newStart = targetSet.first();
76 }
77 for (BigDecimal object : targetSet) {
78 if (newStart == null) {
79 if (object.compareTo(bdLower) != -1) {
80 newStart = object;
81 if (newEnd != null) {
82 break;
83 }
84 }
85 }
86
87 if (newEnd == null) {
88 if (object.compareTo(bdHigher) != -1) {
89 newEnd = object;
90 if (newStart != null) {
91 break;
92 }
93 }
94 }
95 }
96
97 if (newStart == null) {
98 if (newEnd == null) {
99 if ((bdHigher.compareTo(targetSet.first()) == -1)
100 || (bdLower.compareTo(targetSet.last()) == 1)) {
101 return null;
102 }
103 return targetSet;
104 } else {
105 SortedSet<BigDecimal> newTargetSet = targetSet.headSet(
106 newEnd, true);
107 if (newEnd.compareTo(bdHigher) == 1) {
108 newTargetSet.remove(newEnd);
109 }
110 return newTargetSet;
111 }
112 } else {
113 if (newEnd == null) {
114 SortedSet<BigDecimal> newTargetSet = targetSet.tailSet(
115 newStart, true);
116 return newTargetSet;
117 } else {
118 SortedSet<BigDecimal> newTargetSet = targetSet.subSet(
119 newStart, true, newEnd, true);
120 if (newEnd.compareTo(bdHigher) == 1) {
121 newTargetSet.remove(newEnd);
122 }
123 return newTargetSet;
124 }
125 }
126 }
127 }
128 }
第三种方式: 使用Apache Commons Collections, 直接对于原始Set进行filter. 2 String higher) {
3
4 BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
5 BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
6
7 if ((bdHigher.compareTo(targetSet.first()) == -1)
8 || (bdLower.compareTo(targetSet.last()) == 1)) {
9 return null;
10 }
11
12 boolean hasLower = targetSet.contains(bdLower);
13 boolean hasHigher = targetSet.contains(bdHigher);
14 if (hasLower) {
15 if (hasHigher) {
16 System.out.println("get start:" + bdLower);
17 System.out.println("get end:" + bdHigher);
18 return targetSet.subSet(bdLower, true, bdHigher, true);
19 } else {
20 BigDecimal newEnd = null;
21 System.out.println("get start:" + bdLower);
22 SortedSet<BigDecimal> returnSet = null;
23 if (bdHigher.compareTo(targetSet.last()) != -1) {
24 newEnd = targetSet.last();
25 } else {
26 SortedSet<BigDecimal> newTargetSet = targetSet
27 .tailSet(bdLower);
28 for (BigDecimal object : newTargetSet) {
29 if (object.compareTo(bdHigher) == 1) {
30 newEnd = object;
31 break;
32 } else if (object.compareTo(bdHigher) == 0) {
33 newEnd = object;
34 break;
35 }
36 }
37 }
38 returnSet = targetSet.subSet(bdLower, true, newEnd, true);
39 if (newEnd.compareTo(bdHigher) == 1) {
40 returnSet.remove(newEnd);
41 }
42 return returnSet;
43 }
44
45 } else {
46 if (hasHigher) {
47 System.out.println("get end:" + bdHigher);
48 TreeSet<BigDecimal> newTargetSet = (TreeSet<BigDecimal>) targetSet
49 .headSet(bdHigher, true);
50 BigDecimal newStart = null;
51 SortedSet<BigDecimal> returnSet = null;
52
53 if (bdLower.compareTo(targetSet.first()) == -1) {
54 newStart = targetSet.first();
55 } else {
56 for (BigDecimal object : newTargetSet) {
57 if (object.compareTo(bdLower) != -1) {
58 newStart = object;
59 break;
60 }
61 }
62 }
63 returnSet = targetSet.subSet(newStart, true, bdHigher, true);
64
65 return returnSet;
66 } else {
67 System.out.println("Not get start:" + bdLower);
68 System.out.println("Not get end:" + bdHigher);
69 BigDecimal newStart = null;
70 BigDecimal newEnd = null;
71 if (bdHigher.compareTo(targetSet.last()) != -1) {
72 newEnd = targetSet.last();
73 }
74 if (bdLower.compareTo(targetSet.first()) == -1) {
75 newStart = targetSet.first();
76 }
77 for (BigDecimal object : targetSet) {
78 if (newStart == null) {
79 if (object.compareTo(bdLower) != -1) {
80 newStart = object;
81 if (newEnd != null) {
82 break;
83 }
84 }
85 }
86
87 if (newEnd == null) {
88 if (object.compareTo(bdHigher) != -1) {
89 newEnd = object;
90 if (newStart != null) {
91 break;
92 }
93 }
94 }
95 }
96
97 if (newStart == null) {
98 if (newEnd == null) {
99 if ((bdHigher.compareTo(targetSet.first()) == -1)
100 || (bdLower.compareTo(targetSet.last()) == 1)) {
101 return null;
102 }
103 return targetSet;
104 } else {
105 SortedSet<BigDecimal> newTargetSet = targetSet.headSet(
106 newEnd, true);
107 if (newEnd.compareTo(bdHigher) == 1) {
108 newTargetSet.remove(newEnd);
109 }
110 return newTargetSet;
111 }
112 } else {
113 if (newEnd == null) {
114 SortedSet<BigDecimal> newTargetSet = targetSet.tailSet(
115 newStart, true);
116 return newTargetSet;
117 } else {
118 SortedSet<BigDecimal> newTargetSet = targetSet.subSet(
119 newStart, true, newEnd, true);
120 if (newEnd.compareTo(bdHigher) == 1) {
121 newTargetSet.remove(newEnd);
122 }
123 return newTargetSet;
124 }
125 }
126 }
127 }
128 }
测试效果:集合大小10M,过滤结果1M, 运算时间: 1000ms+
过滤部分的代码如下:
1 //过滤的主体逻辑
2 void filterSet(Set<BigDecimal> targetSet, String lower, String higher) {
3 final BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
4 final BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
5
6 Predicate predicate = new Predicate() {
7 public boolean evaluate(Object object) {
8 BigDecimal bDObject = (BigDecimal) object;
9 return bDObject.compareTo(bdLower) >= 0
10 && bDObject.compareTo(bdHigher) <= 0;
11 }
12 };
13
14 CollectionUtils.filter(targetSet, predicate);
15 }
2 void filterSet(Set<BigDecimal> targetSet, String lower, String higher) {
3 final BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
4 final BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
5
6 Predicate predicate = new Predicate() {
7 public boolean evaluate(Object object) {
8 BigDecimal bDObject = (BigDecimal) object;
9 return bDObject.compareTo(bdLower) >= 0
10 && bDObject.compareTo(bdHigher) <= 0;
11 }
12 };
13
14 CollectionUtils.filter(targetSet, predicate);
15 }
第四种方式:使用Guava(google Collections), 直接对于原始Set进行Filter
测试效果:集合大小10M,过滤结果1M, 运算时间: 100ms-
过滤部分的代码如下:
1 //guava filter
2
3 Set<BigDecimal> filterSet(Set<BigDecimal> targetSet, String lower,
4 String higher) {
5 final BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
6 final BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
7
8 Set<BigDecimal> filterCollection = Sets.filter(targetSet,
9 new Predicate<BigDecimal>() {
10 @Override
11 public boolean apply(BigDecimal input) {
12 BigDecimal bDObject = (BigDecimal) input;
13 return bDObject.compareTo(bdLower) >= 0
14 && bDObject.compareTo(bdHigher) <= 0;
15 }
16 });
17
18 return filterCollection;
19 }
2
3 Set<BigDecimal> filterSet(Set<BigDecimal> targetSet, String lower,
4 String higher) {
5 final BigDecimal bdLower = new BigDecimal(Double.parseDouble(lower));
6 final BigDecimal bdHigher = new BigDecimal(Double.parseDouble(higher));
7
8 Set<BigDecimal> filterCollection = Sets.filter(targetSet,
9 new Predicate<BigDecimal>() {
10 @Override
11 public boolean apply(BigDecimal input) {
12 BigDecimal bDObject = (BigDecimal) input;
13 return bDObject.compareTo(bdLower) >= 0
14 && bDObject.compareTo(bdHigher) <= 0;
15 }
16 });
17
18 return filterCollection;
19 }
四种方式对比如下:
第一种方式: 仅依赖于JAVA原生类库 遍历时间最慢, 代码量很小
第二种方式: 仅依赖于JAVA原生类库 遍历时间比较慢(主要慢在生成有序Set), 代码量最多
第三种方式: 依赖于Apache Commons Collections, 遍历时间比较快, 代码量很少
第四种方式: 依赖于Guava, 遍历时间最快, 代码量很少
基于目前个人的技术水平和视野, 第四种方式可能是最佳选择.
记录一下, 以后可能还会有更好的方案.