MySQL中字符串查询如何提高效率

本文介绍了 MySQL中字符串查询如何提高效率

1、MySQL中如何提高字符串查询效率?
对字符串加索引?一般情况下,是不建议在字符串加索引,占空间
如果一定要加,建议可以指定长度,前提是字符串前面部分区分度好的话,此时这类索引就叫前缀索引

2、前缀索引的问题

区分度不好的话,很容易发生碰撞,进而引发一系列问题。

前缀索引和普通索引在只有where条件、order by和group by不同执行情况,可以看到Extra的说明,前缀索引只有where条件,无法使用覆盖索引,order by会使用filesort,group by会使用temporary和filesort
总的来说,前缀索引无法使用覆盖索引,进而导致order by和group by要使用文件排序,甚至临时表

3、字符串加不加索引的测试

1、不加索引
查询的SQL:SELECT SQL_NO_CACHE * FROM string_index_test WHERE name=‘forlan’;

Benchmark
        Average number of seconds to run all queries: 8.328 seconds
        Minimum number of seconds to run all queries: 8.328 seconds
        Maximum number of seconds to run all queries: 8.328 seconds
        Number of clients running queries: 100
        Average number of queries per client: 0

Benchmark
        Average number of seconds to run all queries: 18.078 seconds
        Minimum number of seconds to run all queries: 18.078 seconds
        Maximum number of seconds to run all queries: 18.078 seconds
        Number of clients running queries: 200
        Average number of queries per client: 0

2、加字符串索引

alter table string_index_test add index idx_name (name) USING BTREE;

查询的SQL:SELECT SQL_NO_CACHE * FROM string_index_test WHERE name=‘forlan’;

Benchmark
        Average number of seconds to run all queries: 0.250 seconds
        Minimum number of seconds to run all queries: 0.250 seconds
        Maximum number of seconds to run all queries: 0.250 seconds
        Number of clients running queries: 100
        Average number of queries per client: 0

Benchmark
        Average number of seconds to run all queries: 1.438 seconds
        Minimum number of seconds to run all queries: 1.438 seconds
        Maximum number of seconds to run all queries: 1.438 seconds
        Number of clients running queries: 200
        Average number of queries per client: 0

4、使用CRC32创建索引

CRC全称为Cyclic Redundancy Check,又叫循环冗余校验。
CRC32是CRC算法的一种,返回值的范围0~2^32-1,使用bigint存储

加一个name_crc32列,创建这个列的所有,索引空间小很多,利用整型加速查询

加索引:alter table string_index_test add index idx_nam_crc32 (name_crc32) USING BTREE;
查询的SQL:SELECT SQL_NO_CACHE * FROM string_index_test WHERE name_crc32=CRC32(‘forlan’) and name=‘forlan’;

因为CRC32存在发生碰撞,所以加上name条件,才能筛选出正确的数据

Benchmark
        Average number of seconds to run all queries: 0.266 seconds
        Minimum number of seconds to run all queries: 0.266 seconds
        Maximum number of seconds to run all queries: 0.266 seconds
        Number of clients running queries: 100
        Average number of queries per client: 0

Benchmark
        Average number of seconds to run all queries: 0.390 seconds
        Minimum number of seconds to run all queries: 0.390 seconds
        Maximum number of seconds to run all queries: 0.390 seconds
        Number of clients running queries: 200
        Average number of queries per client: 0

总结:

  • 通过对字符串加索引,可以提高查询效率,但需要注意指定长度,无法使用覆盖索引
  • 通过使用CRC32,需要额外存一个字段,将字符串转为整数存储,节省空间,效率提升并不是很大,但存在碰撞问题,可以加多字符串筛选条件
  • 对于CRC32存在碰撞问题,可以使用CRC64减少碰撞,但需要安装 common_schema database函数库
上一篇 下一篇


推荐文章

评论
说点什么吧?

发表评论

取消回复
  最新文章