
PostgreSQL at 100 Billion Rows: Regular Expression Matching, Fast and Furious

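The test environment is a PostgreSQL cluster of 8 hosts (16 cores per host) with 240 data nodes in total; the test data set holds 100.8 billion rows.
Performance chart: [figure not available]
For faster response times, add hosts and nodes (or add CPUs and nodes) to shorten the recheck phase.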

Data generation method:

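#!/bin/bash
# Take the first 48 bits of the 128-bit MD5 hex digest computed from random() and keep them as text:
# the result is a random 12-character string made of [0-9] and [a-f].

# The column definition (info text) is inferred from the index definitions and the pg_stats output below.
psql digoal digoal -c "create table t_regexp_100billion (info text) distributed randomly"

# 1008 batches of 100 million rows each = 100.8 billion rows.
for ((i=1;i<=1008;i++))
do
  psql digoal digoal -c "copy (select substring(md5(random()::text),1,12) from generate_series(1,100000000)) to stdout" | psql digoal digoal -c "copy t_regexp_100billion from stdin"
done

# B-tree index on info, expression index on reverse(info), and trigram GIN index (needs the pg_trgm extension).
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_1 on t_regexp_100billion(info)"
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_2 on t_regexp_100billion(reverse(info))"
psql digoal digoal -c "set maintenance_work_mem='4GB'; create index idx_t_regexp_100billion_gin on t_regexp_100billion using gin (info gin_trgm_ops)"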
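
For readers without a cluster at hand, the value generator inside the COPY statement can be imitated in plain Python. This is only a sketch: `random_hex12` is a hypothetical helper name, and Python's `random.random()` is not the same generator as PostgreSQL's `random()`, so the duplicate rate of the output will differ from the real table.

```python
import hashlib
import random

def random_hex12(rng: random.Random) -> str:
    """Imitate substring(md5(random()::text), 1, 12): first 12 hex chars (48 bits) of an MD5 digest."""
    text = repr(rng.random())  # textual form of a random float, standing in for random()::text
    return hashlib.md5(text.encode("ascii")).hexdigest()[:12]

rng = random.Random(42)  # fixed seed only to make the sketch repeatable
sample = [random_hex12(rng) for _ in range(5)]
for value in sample:
    print(value)
```

Each printed value has the same shape as the `info` column: 12 characters drawn from [0-9a-f].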

Data overview:

digoal=> select count(*) from t_regexp_100billion ;  
    count       
--------------  
 100800000000  
(1 row)  
Time: 228721.386 ms

Table size:

digoal=> \dt+ t_regexp_100billion   
                           List of relations  
 Schema |        Name         | Type  | Owner  |  Size   | Description   
--------+---------------------+-------+--------+---------+-------------  
 public | t_regexp_100billion | table | digoal | 4158 GB |   
(1 row)  

Index sizes:

idx_t_regexp_100billion_1     2961 GB  
idx_t_regexp_100billion_2     2961 GB  
idx_t_regexp_100billion_gin   2300 GB  
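
As a rough sanity check, the sizes above can be divided by the row count to get an average footprint per row. The sketch below assumes that GB in the psql output means 2^30 bytes (as `pg_size_pretty` reports it) and takes the row count from the `count(*)` output; the names are illustrative.

```python
ROWS = 100_800_000_000   # from the count(*) output
GIB = 2 ** 30            # assumption: psql size units are 1024-based

sizes_gb = {
    "table": 4158,
    "btree index": 2961,
    "gin trigram index": 2300,
}

def bytes_per_row(size_gb: int, rows: int = ROWS) -> float:
    """Average number of bytes one row occupies in a relation of the given size."""
    return size_gb * GIB / rows

for name, size in sizes_gb.items():
    print(f"{name}: {bytes_per_row(size):.1f} bytes per row")
```

This works out to roughly 44 bytes per table row, about 32 bytes per B-tree entry, and about 24.5 bytes per row for the GIN index.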

Sample of the test data:

digoal=> select * from t_regexp_100billion offset 1000000 limit 10;  
     info       
--------------  
 bca0fb45367e  
 3051ca8a9a38  
 fadc91a3a4de  
 710b9c60417e  
 279dd9832cc3  
 f4743fe2e83b  
 9ce9e42d4039  
 65e64742fd3f  
 db3d0e0edc52  
 7cfb00bb38ec  
(10 rows)  

Duplication sampling. Strings derived from the MD5 of random() keep the duplication rate very low:

digoal=> select count(distinct info) from (select * from t_regexp_100billion offset 1299422811 limit 1000000) t;  
 count    
--------  
 999750  
(1 row)  
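
The same duplicate check can be expressed outside the database. A minimal sketch, with `duplicate_ratio` and `count_distinct` as hypothetical helper names; the figures plugged in at the end are the ones from the query above.

```python
def count_distinct(values) -> int:
    """Equivalent of count(distinct info) for an in-memory sample."""
    return len(set(values))

def duplicate_ratio(total_rows: int, distinct_rows: int) -> float:
    """Fraction of sampled rows that repeat a value already present in the sample."""
    return (total_rows - distinct_rows) / total_rows

sampled = 1_000_000   # limit 1000000
distinct = 999_750    # result of count(distinct info)
print(f"{duplicate_ratio(sampled, distinct):.4%} of the sampled rows are duplicates")
```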

Statistics:

digoal=> alter table t_regexp_100billion alter column info set statistics 10000;  
ALTER TABLE  
digoal=> analyze t_regexp_100billion ;  
ANALYZE  

schemaname             | public  
tablename              | t_regexp_100billion  
attname                | info  
inherited              | f  
null_frac              | 0  
avg_width              | 13  
n_distinct             | -0.836834             # sampled statistics: about 83.6834% of the values are distinct  
most_common_vals       | (pg_catalog.text){7f68d12d2205,00083380706d,00154b6d79e8,...    
most_common_freqs      | {1e-06,6.66667e-07,6.66667e-07,6.66667e-07,.....        # the most common value has a frequency of 1e-06, i.e. roughly 100,000 occurrences in 100 billion rows  
histogram_bounds       | (pg_catalog.text){0000008123b7,00066c71c9bb,000d672de234,...  
correlation            | 0.000237291  
most_common_elems      |   
most_common_elem_freqs |   
elem_count_histogram   |   

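Actual number of occurrences of 7f68d12d2205. The blocks containing 7f68d12d2205 were probably over-represented in the sample, which would explain why the database overestimates its share: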

digoal=> select count(*) from t_regexp_100billion where info='7f68d12d2205';  
-[ RECORD 1 ]  
count | 54  

digoal=> select ctid from t_regexp_100billion where info='7f68d12d2205' order by 1;  
     ctid        
---------------  
 (15343,114)  
 (62134,39)  
 (96808,112)  
 (116492,176)  
 (194615,143)  
 (328074,116)  
 (364037,115)  
 (375240,158)  
 (376187,152)  
 (602144,81)  
 (664026,6)  
 (689501,136)  
 (695345,130)  
 (697374,126)  
 (714719,148)  
 (743169,20)  
 (802326,139)  
 (833830,41)  
 (839417,185)  
 (892417,78)  
 (892493,149)  
 (907979,52)  
 (967078,163)  
 (990313,159)  
 (1007998,27)  
 (1106961,57)  
 (1142731,165)  
 (1148427,67)  
 (1156654,156)  
 (1205854,137)  
 (1243429,68)  
 (1277287,165)  
 (1328836,98)  
 (1331727,150)  
 (1337534,3)  
 (1360947,104)  
 (1438970,97)  
 (1476941,22)  
 (1482022,82)  
 (1486307,69)  
 (1548445,155)  
 (1557209,82)  
 (1564980,158)  
 (1646685,76)  
 (1663018,99)  
 (1678604,77)  
 (1755845,177)  
 (1981937,153)  
 (1984723,98)  
 (2071955,59)  
 (2093147,149)  
 (2199794,102)  
 (2204957,44)  
 (2234820,142)  
(54 rows)  
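
The gap between the sampled estimate and the real count can be worked through numerically. A small sketch using only the figures shown above (frequency 1e-06, 100,800,000,000 rows, 54 actual occurrences); the function name is illustrative.

```python
TOTAL_ROWS = 100_800_000_000   # count(*) of the table
MCV_FREQ = 1e-06               # most_common_freqs entry for 7f68d12d2205
ACTUAL = 54                    # count(*) where info = '7f68d12d2205'

def estimated_rows(freq: float, total: int = TOTAL_ROWS) -> int:
    """Row count implied by a most_common_freqs entry."""
    return round(freq * total)

est = estimated_rows(MCV_FREQ)
print(f"estimated: {est}, actual: {ACTUAL}, overestimate factor: {est / ACTUAL:.0f}x")
```

The estimate is off by three orders of magnitude, which is consistent with the sampling explanation given above: a value that happens to be hit a few times in the sample is extrapolated to the whole table.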

Performance tests:
Prefix match query speed:

digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';  
     ctid      | tableoid |     info       
---------------+----------+--------------  
 (124741,60)   |    16677 | 80ebcdd47006  
 (896121,64)   |    16659 | 80ebcdd47006  
 (1124495,97)  |    16659 | 80ebcdd47006  
 (1126474,141) |    16659 | 80ebcdd47006  
 (1059471,62)  |    16659 | 80ebcdd47006  
 (1296562,115) |    16659 | 80ebcdd47006  
 (1190941,122) |    16659 | 80ebcdd47006  
 (680853,129)  |    16659 | 80ebcdd47006  
 (1010667,15)  |    16659 | 80ebcdd47006  
 (1386348,25)  |    16659 | 80ebcdd47006  
 (1522827,90)  |    16659 | 80ebcdd47006  
 (2204071,129) |    16659 | 80ebcdd47006  
 (1570431,114) |    16659 | 80ebcdd47006  
 (888185,38)   |    16659 | 80ebcdd47006  
 (605886,160)  |    16659 | 80ebcdd47006  
 (1306061,123) |    16659 | 80ebcdd47006  
 (757157,47)   |    16659 | 80ebcdd47006  
 (1166290,83)  |    16659 | 80ebcdd47006  
 (419730,1)    |    16659 | 80ebcdd47006  
 (1833853,131) |    16659 | 80ebcdd47006  
 (964866,120)  |    16659 | 80ebcdd47006  
 (904961,175)  |    16659 | 80ebcdd47006  
 (984373,32)   |    16659 | 80ebcdd47006  
 (891018,145)  |    16659 | 80ebcdd47006  
 (1520483,121) |    16659 | 80ebcdd47006  
 (571001,124)  |    16659 | 80ebcdd47006  
 (802093,55)   |    16659 | 80ebcdd47006  
 (6831,172)    |    16659 | 80ebcdd47006  
 (1169137,84)  |    16659 | 80ebcdd47006  
 (77398,164)   |    16659 | 80ebcdd47006  
 (24132,98)    |    16659 | 80ebcdd47006  
 (564322,152)  |    16659 | 80ebcdd47006  
 (357087,172)  |    16659 | 80ebcdd47006  
 (1823628,60)  |    16659 | 80ebcdd47006  
 (2153609,52)  |    16659 | 80ebcdd47006  
 (816401,140)  |    16659 | 80ebcdd47006  
 (542383,53)   |    16662 | 80ebcdd47006  
 (1340971,64)  |    16662 | 80ebcdd47006  
 (1239166,108) |    16662 | 80ebcdd47006  
 (2033648,39)  |    16662 | 80ebcdd47006  
 (1890808,93)  |    16662 | 80ebcdd47006  
 (1213124,4)   |    16662 | 80ebcdd47006  
 (1025184,106) |    16662 | 80ebcdd47006  
 (620238,131)  |    16662 | 80ebcdd47006  
 (583064,74)   |    16662 | 80ebcdd47006  
 (1454680,42)  |    16671 | 80ebcdd47006  
 (417385,74)   |    16671 | 80ebcdd47006  
 (323669,61)   |    16671 | 80ebcdd47006  
 (1759181,138) |    16671 | 80ebcdd47006  
 (2112157,146) |    16671 | 80ebcdd47006  
 (431326,92)   |    16671 | 80ebcdd47006  
 (2097356,110) |    16671 | 80ebcdd47006  
(52 rows)  
Time: 3226.393 ms  

digoal=> explain (analyze,verbose,buffers,costs,timing) select ctid,tableoid,info from t_regexp_100billion where info ~ '^80ebcdd47';  
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=3085.502..3112.273 rows=52 loops=1)  
   Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info  
   Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  
8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  
2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  
ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  
ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  
ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  
2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   
h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,  
 h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   
h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  
data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9  
   Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (info ~ '^80ebcdd47'::text)  
 Planning time: 0.061 ms  
 Execution time: 3112.296 ms  
(6 rows)  
Time: 3139.928 ms  
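
A left-anchored pattern such as `^80ebcdd47` can be answered from an ordered index, because every match lies in the range from the prefix up to (but excluding) the prefix with its last character incremented. The sketch below imitates that idea with a sorted Python list and `bisect`. It assumes plain byte-wise ordering and a non-empty prefix, it is not PostgreSQL's actual implementation, and the helper names are hypothetical. The sample values are taken from the outputs above.

```python
import bisect
from typing import List

def prefix_upper_bound(prefix: str) -> str:
    """Smallest string greater than every string that starts with prefix (byte-wise order)."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)

def prefix_search(sorted_values: List[str], prefix: str) -> List[str]:
    """Range scan over a sorted list: prefix <= value < upper bound."""
    lo = bisect.bisect_left(sorted_values, prefix)
    hi = bisect.bisect_left(sorted_values, prefix_upper_bound(prefix))
    return sorted_values[lo:hi]

# A tiny stand-in for the ordered index on info.
index = sorted(["bca0fb45367e", "80ebcdd47006", "3051ca8a9a38", "80ebcdd47006", "fadc91a3a4de"])
print(prefix_search(index, "80ebcdd47"))
```

Only the two binary searches touch the index, so the cost depends on the number of matches rather than on the size of the data set.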

Suffix match query speed:

digoal=> select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';  
     ctid      | tableoid |     info       
---------------+----------+--------------  
 (124741,26)   |    16677 | f3b98021d24f  
 (1696888,151) |    16659 | f3b98021d24f  
 (1278911,101) |    16659 | f3b98021d24f  
 (1427480,157) |    16659 | f3b98021d24f  
 (449192,30)   |    16659 | f3b98021d24f  
 (1833887,81)  |    16659 | f3b98021d24f  
 (229525,72)   |    16659 | f3b98021d24f  
 (1353789,17)  |    16659 | f3b98021d24f  
 (1875911,148) |    16659 | f3b98021d24f  
 (1847078,35)  |    16659 | f3b98021d24f  
 (316780,156)  |    16659 | f3b98021d24f  
 (1265453,120) |    16659 | f3b98021d24f  
 (100075,60)   |    16659 | f3b98021d24f  
 (1924176,2)   |    16659 | f3b98021d24f  
 (279583,2)    |    16659 | f3b98021d24f  
 (1631226,23)  |    16659 | f3b98021d24f  
 (1906666,50)  |    16659 | f3b98021d24f  
 (1640803,116) |    16659 | f3b98021d24f  
 (629651,46)   |    16659 | f3b98021d24f  
 (134982,13)   |    16659 | f3b98021d24f  
 (380660,123)  |    16659 | f3b98021d24f  
 (2158193,31)  |    16659 | f3b98021d24f  
 (324901,64)   |    16659 | f3b98021d24f  
 (1243973,160) |    16659 | f3b98021d24f  
 (540958,139)  |    16659 | f3b98021d24f  
 (441475,99)   |    16659 | f3b98021d24f  
 (1207114,121) |    16659 | f3b98021d24f  
 (574598,21)   |    16659 | f3b98021d24f  
 (1253283,185) |    16659 | f3b98021d24f  
 (1396717,142) |    16659 | f3b98021d24f  
 (149738,9)    |    16659 | f3b98021d24f  
 (764749,26)   |    16659 | f3b98021d24f  
 (1211899,5)   |    16659 | f3b98021d24f  
 (1626746,65)  |    16659 | f3b98021d24f  
 (1342895,124) |    16659 | f3b98021d24f  
 (733794,136)  |    16659 | f3b98021d24f  
 (417796,2)    |    16659 | f3b98021d24f  
 (555520,163)  |    16659 | f3b98021d24f  
 (232038,105)  |    16659 | f3b98021d24f  
 (355107,127)  |    16659 | f3b98021d24f  
 (352143,175)  |    16662 | f3b98021d24f  
 (1856293,69)  |    16662 | f3b98021d24f  
 (1405106,105) |    16662 | f3b98021d24f  
 (47689,79)    |    16662 | f3b98021d24f  
 (679310,7)    |    16671 | f3b98021d24f  
 (1076234,164) |    16671 | f3b98021d24f  
(46 rows)  
Time: 3140.835 ms  


digoal=> explain (verbose,costs,timing,buffers,analyze) select ctid,tableoid,info from t_regexp_100billion where reverse(info) ~ '^f42d12089b';  
 Remote Fast Query Execution  (cost=0.00..0.00 rows=0 width=0) (actual time=3085.738..3112.216 rows=46 loops=1)  
   Output: t_regexp_100billion.ctid, t_regexp_100billion.tableoid, t_regexp_100billion.info  
   Node/s: h1_data1, h1_data10, h1_data11, h1_data12, h1_data13, h1_data14, h1_data15, h1_data16, h1_data17, h1_data18, h1_data19, h1_data2, h1_data20, h1_data21, h1_data22, h1_data23, h1_data24, h1_data25, h1_data26, h1_data27, h1_data2  
8, h1_data29, h1_data3, h1_data30, h1_data4, h1_data5, h1_data6, h1_data7, h1_data8, h1_data9, h2_data1, h2_data10, h2_data11, h2_data12, h2_data13, h2_data14, h2_data15, h2_data16, h2_data17, h2_data18, h2_data19, h2_data2, h2_data20, h  
2_data21, h2_data22, h2_data23, h2_data24, h2_data25, h2_data26, h2_data27, h2_data28, h2_data29, h2_data3, h2_data30, h2_data4, h2_data5, h2_data6, h2_data7, h2_data8, h2_data9, h3_data1, h3_data10, h3_data11, h3_data12, h3_data13, h3_d  
ata14, h3_data15, h3_data16, h3_data17, h3_data18, h3_data19, h3_data2, h3_data20, h3_data21, h3_data22, h3_data23, h3_data24, h3_data25, h3_data26, h3_data27, h3_data28, h3_data29, h3_data3, h3_data30, h3_data4, h3_data5, h3_data6, h3_d  
ata7, h3_data8, h3_data9, h4_data1, h4_data10, h4_data11, h4_data12, h4_data13, h4_data14, h4_data15, h4_data16, h4_data17, h4_data18, h4_data19, h4_data2, h4_data20, h4_data21, h4_data22, h4_data23, h4_data24, h4_data25, h4_data26, h4_d  
ata27, h4_data28, h4_data29, h4_data3, h4_data30, h4_data4, h4_data5, h4_data6, h4_data7, h4_data8, h4_data9, h5_data1, h5_data10, h5_data11, h5_data12, h5_data13, h5_data14, h5_data15, h5_data16, h5_data17, h5_data18, h5_data19, h5_data  
2, h5_data20, h5_data21, h5_data22, h5_data23, h5_data24, h5_data25, h5_data26, h5_data27, h5_data28, h5_data29, h5_data3, h5_data30, h5_data4, h5_data5, h5_data6, h5_data7, h5_data8, h5_data9, h6_data1, h6_data10, h6_data11, h6_data12,   
h6_data13, h6_data14, h6_data15, h6_data16, h6_data17, h6_data18, h6_data19, h6_data2, h6_data20, h6_data21, h6_data22, h6_data23, h6_data24, h6_data25, h6_data26, h6_data27, h6_data28, h6_data29, h6_data3, h6_data30, h6_data4, h6_data5,  
 h6_data6, h6_data7, h6_data8, h6_data9, h7_data1, h7_data10, h7_data11, h7_data12, h7_data13, h7_data14, h7_data15, h7_data16, h7_data17, h7_data18, h7_data19, h7_data2, h7_data20, h7_data21, h7_data22, h7_data23, h7_data24, h7_data25,   
h7_data26, h7_data27, h7_data28, h7_data29, h7_data3, h7_data30, h7_data4, h7_data5, h7_data6, h7_data7, h7_data8, h7_data9, h8_data1, h8_data10, h8_data11, h8_data12, h8_data13, h8_data14, h8_data15, h8_data16, h8_data17, h8_data18, h8_  
data19, h8_data2, h8_data20, h8_data21, h8_data22, h8_data23, h8_data24, h8_data25, h8_data26, h8_data27, h8_data28, h8_data29, h8_data3, h8_data30, h8_data4, h8_data5, h8_data6, h8_data7, h8_data8, h8_data9  
   Remote query: SELECT ctid, tableoid, info FROM t_regexp_100billion WHERE (reverse(info) ~ '^f42d12089b'::text)  
 Planning time: 0.063 ms  
 Execution time: 3112.236 ms  
(6 rows)  

Time: 3139.890 ms  
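
The suffix query works because a suffix of `info` is a prefix of `reverse(info)`, so the expression index on `reverse(info)` can serve it like a prefix search. A minimal illustration in Python, using a value from the result above; the helper names are hypothetical.

```python
def suffix_to_reversed_prefix(suffix: str) -> str:
    """Turn the suffix being searched for into the prefix to look up in reverse(info)."""
    return suffix[::-1]

def matches_suffix_via_reverse(value: str, suffix: str) -> bool:
    """Check a suffix match the way the query does: a prefix test on the reversed value."""
    return value[::-1].startswith(suffix_to_reversed_prefix(suffix))

value = "f3b98021d24f"
print(value[::-1])                               # reversed form, as indexed by reverse(info)
print(suffix_to_reversed_prefix("b98021d24f"))   # the text that follows '^' in the query
print(matches_suffix_via_reverse(value, "b98021d24f"))
```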

Infix match (wildcards on both sides) query speed:

digoal=> select ctid,tableoid,info from t_regexp_100billion where info ~ 'e7add04871';  
     ctid      | tableoid |     info       
---------------+----------+--------------  
 (124741,45)   |    16677 | be7add048713  
 (49315,69)    |    16659 | be7add048713  
 (1770876,21)  |    16659 | be7add048713  
 (199079,143)  |    16659 | be7add048713  
 (151110,141)  |    16659 | be7add048713  
 (1597384,137) |    16659 | be7add048713  
 (1693453,25)  |    16659 | be7add048713  
 (101576,132)  |    16659 | be7add048713  
 (1110249,50)  |    16659 | be7add048713  
 (792326,68)   |    16659 | be7add048713  
 (1676705,68)  |    16659 | be7add048713  
 (1269148,101) |    16659 | be7add048713  
 (1027442,113) |    16659 | be7add048713  
 (1078144,100) |    16659 | be7add048713  
 (584038,141)  |    16659 | be7add048713  
 (1245454,80)  |    16659 | be7add048713  
 (1551184,102) |    16659 | 
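
An unanchored pattern cannot be turned into a range scan on either B-tree index, so the trigram GIN index created earlier is the likely candidate here. The idea: split every value into 3-character substrings, use an inverted index to find rows that contain all trigrams of the search string, then recheck each candidate against the real pattern (the recheck step mentioned at the start). The sketch below is a simplified toy model; pg_trgm's real trigram extraction also pads and normalizes strings, which is omitted here, and all names are illustrative.

```python
from collections import defaultdict

def trigrams(text: str) -> set:
    """All 3-character substrings of text (no padding, unlike pg_trgm)."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(rows: list) -> dict:
    """Inverted index: trigram -> set of row positions that contain it."""
    index = defaultdict(set)
    for pos, value in enumerate(rows):
        for tri in trigrams(value):
            index[tri].add(pos)
    return index

def search(rows: list, index: dict, needle: str) -> list:
    """Intersect posting lists, then recheck candidates with a real substring test."""
    wanted = trigrams(needle)
    if not wanted:
        # Needle shorter than 3 characters: no trigram to look up, fall back to a full scan.
        candidates = set(range(len(rows)))
    else:
        candidates = set.intersection(*(index.get(t, set()) for t in wanted))
    # Recheck: sharing all trigrams does not guarantee that they appear contiguously.
    return [rows[pos] for pos in sorted(candidates) if needle in rows[pos]]

rows = ["be7add048713", "bca0fb45367e", "3051ca8a9a38", "f3b98021d24f"]
index = build_index(rows)
print(search(rows, index, "e7add04871"))
```

The recheck is the part that scales with the number of candidates, which is why spreading it over more nodes and CPUs shortens the response time.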
            
           
