一、需求
为方便用户快速查找某个城市的门店,在门店列表页增加了按城市首字母筛选的功能
二、后端实现
数据库一般会存储所有城市的首字母字段,后端只需要查询库进行返回即可。
但如果全国城市区域有很多,数据库没有存粗字段的情况下,如何实现呢?
一般会有人选择 pinyin4j、经过测试对于多音字的处理并不太友好,于是选择了具有语义的Hanlp的依赖包。能处理大多数的多音字城市
<dependency> <groupId>com.hankcs</groupId> <artifactId>hanlp</artifactId> <version>portable-1.8.3</version> </dependency>
测试类
import com.google.common.collect.ImmutableMap; import com.hankcs.hanlp.HanLP; import com.hankcs.hanlp.dictionary.py.Pinyin; import com.hankcs.hanlp.dictionary.py.PinyinDictionary; import java.util.List; import java.util.Map; public class HanLpTest { public static void main(String[] args) { String text = "重载不是重担," + HanLP.convertToTraditionalChinese("以后爱皇后"); List<Pinyin> pinyinList = PinyinDictionary.convertToPinyin(text); System.out.print("原文,"); for (char c : text.toCharArray()) { System.out.printf("%c,", c); } System.out.println(); System.out.print("拼音(数字音调),"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin); } System.out.println(); System.out.print("拼音(符号音调),"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getPinyinWithToneMark()); } System.out.println(); System.out.print("拼音(无音调),"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getPinyinWithoutTone()); } System.out.println(); System.out.print("声调,"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getTone()); } System.out.println(); System.out.print("声母,"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getShengmu()); } System.out.println(); System.out.print("韵母,"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getYunmu()); } System.out.println(); System.out.print("输入法头,"); for (Pinyin pinyin : pinyinList) { System.out.printf("%s,", pinyin.getHead()); } System.out.println(); } } 执行结果 原文,重,载,不,是,重,担,,,以,後,愛,皇,后, 拼音(数字音调),chong2,zai3,bu2,shi4,zhong4,dan4,none5,yi3,hou4,ai4,huang2,hou4, 拼音(符号音调),chóng,zǎi,bú,shì,zhòng,dàn,none,yǐ,hòu,ài,huáng,hòu, 拼音(无音调),chong,zai,bu,shi,zhong,dan,none,yi,hou,ai,huang,hou, 声调,2,3,2,4,4,4,5,3,4,4,2,4, 声母,ch,z,b,sh,zh,d,none,y,h,none,h,h, 韵母,ong,ai,u,i,ong,an,none,i,ou,ai,uang,ou, 输入法头,ch,z,b,sh,zh,d,none,y,h,a,h,h,
多音字城市处理
import com.google.common.collect.ImmutableMap; import com.hankcs.hanlp.dictionary.py.PinyinDictionary; import java.util.Map; public class HanLpTest { // 特殊的多音字映射 private static Map<String, String> capitalizeMapping = ImmutableMap.of( "朝阳","C", "朝阳市","C"); public static void main(String[] args) { System.out.println(String.valueOf(PinyinDictionary.convertToPinyin("长沙").get(0).getPinyinWithoutTone().charAt(0)).toUpperCase()); // C System.out.println(String.valueOf(PinyinDictionary.convertToPinyin("重庆").get(0).getPinyinWithoutTone().charAt(0)).toUpperCase()); // C System.out.println(String.valueOf(PinyinDictionary.convertToPinyin("厦门").get(0).getPinyinWithoutTone().charAt(0)).toUpperCase()); // X System.out.println(String.valueOf(PinyinDictionary.convertToPinyin("朝阳").get(0).getPinyinWithoutTone().charAt(0)).toUpperCase()); // Z System.out.println(String.valueOf(PinyinDictionary.convertToPinyin("朝阳市").get(0).getPinyinWithoutTone())); // zhao System.out.println(capitalizeMapping.get("朝阳市")); // C } }
未经允许请勿转载:程序喵 » Java 城市汉字转拼音,多音字处理(Hanlp)