
How to remove punctuation in Python

Published 2024/2/25 1:16:05
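Punctuation can be removed from a string in Python in the following ways:
Method 1:
str.isalnum:
s.isalnum() -> bool
Return value: True if the string contains at least one character and every character is a letter or a digit, otherwise False.
Example: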
    >>> string = "special $#! characters spaces 888323"
    >>> ''.join(e for e in string if e.isalnum())
    'specialcharactersspaces888323'
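This filter is heavy-handed: it keeps only characters for which isalnum() is true, so spaces and other whitespace are stripped along with the punctuation. On Python 2 byte strings it also removes Chinese text; on Python 3 str values, Chinese characters count as alphabetic and are kept.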
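To make the side effects of the isalnum() filter concrete, here is a minimal sketch for Python 3. The helper names keep_alnum and keep_alnum_and_spaces are hypothetical (they are not from the original text); the second one shows one possible way to keep whitespace while still dropping punctuation.

```python
def keep_alnum(text):
    """Keep only characters for which str.isalnum() is true."""
    return ''.join(ch for ch in text if ch.isalnum())


def keep_alnum_and_spaces(text):
    """Hypothetical variant: keep whitespace as well as letters and digits."""
    return ''.join(ch for ch in text if ch.isalnum() or ch.isspace())


sample = "special $#! characters spaces 888323"
print(keep_alnum(sample))             # specialcharactersspaces888323
print(keep_alnum_and_spaces(sample))  # whitespace survives, punctuation does not
print(keep_alnum("你好, world!"))      # 你好world (Python 3 keeps the Chinese characters)
```

Note that the whitespace-preserving variant leaves a run of two spaces where the punctuation used to sit between words; collapsing those is a separate step.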
Method 2:
string.punctuation
    import re, string

    s = "string. with. punctuation?"  # sample string

    # Variant 1 (Python 2 only):
    #   out = s.translate(string.maketrans("", ""), string.punctuation)
    # Variant 2 (Python 2 only):
    #   out = s.translate(None, string.punctuation)
    # Neither works in Python 3, where str.translate() no longer accepts a
    # second argument. The Python 3 equivalent is:
    out = s.translate(str.maketrans("", "", string.punctuation))

    # Variant 3:
    exclude = set(string.punctuation)
    out = ''.join(ch for ch in s if ch not in exclude)

    # Variant 4:
    out = s
    for c in string.punctuation:
        out = out.replace(c, "")
    # out == 'string with punctuation'

    # Variant 5:
    out = re.sub('[%s]' % re.escape(string.punctuation), '', s)
    # re.escape escapes every character in the string that could be
    # interpreted as a regex operator

    # Variant 6:
    # string.punctuation only covers ASCII. A broader (but slower) approach
    # uses the unicodedata module:
    from unicodedata import category
    s = 'string \u2014 with - \xabpunctuation \xbb...'
    out = re.sub('[%s]' % re.escape(string.punctuation), '', s)
    print('stripped', out)
    # out == 'string \u2014 with  \xabpunctuation \xbb'
    out = ''.join(ch for ch in s if category(ch)[0] != 'P')
    print('stripped', out)
    # out == 'string  with  punctuation '

    # For Python 3 str (or Python 2 unicode) values, str.translate() only takes
    # a mapping: code points (integers) are looked up in it, and anything
    # mapped to None is removed. To remove ASCII punctuation this way:
    remove_punct_map = dict.fromkeys(map(ord, string.punctuation))
    out = s.translate(remove_punct_map)

    # To remove all Unicode punctuation, build the table from every code point
    # whose general category starts with 'P':
    import unicodedata
    import sys
    tbl = dict.fromkeys(i for i in range(sys.maxunicode + 1)
                        if unicodedata.category(chr(i)).startswith('P'))

    def remove_punctuation(text):
        return text.translate(tbl)
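As a cross-check of the variants above, the following is a minimal Python 3 sketch that wraps four of them in functions and runs them on the same sample string. The function names (strip_translate, strip_set, strip_regex, strip_unicode) are hypothetical labels chosen for this illustration, not names from the original.

```python
import re
import string
import unicodedata


def strip_translate(text):
    # The third argument of str.maketrans lists characters to delete
    return text.translate(str.maketrans("", "", string.punctuation))


def strip_set(text):
    exclude = set(string.punctuation)
    return ''.join(ch for ch in text if ch not in exclude)


def strip_regex(text):
    return re.sub('[%s]' % re.escape(string.punctuation), '', text)


def strip_unicode(text):
    # Unicode general categories starting with 'P' are punctuation
    return ''.join(ch for ch in text
                   if not unicodedata.category(ch).startswith('P'))


s = "string. with. punctuation?"
results = {f.__name__: f(s)
           for f in (strip_translate, strip_set, strip_regex, strip_unicode)}
print(results)  # every value is 'string with punctuation'
```

The approaches agree on this sample but not on every input: string.punctuation includes characters such as $ and +, which Unicode classifies as symbols (categories starting with 'S') rather than punctuation, so strip_unicode leaves them in place.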
Method 3:
re
Example:
    import re
    s = "string. with. punctuation?"
    s = re.sub(r'[^\w\s]', '', s)
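One detail of the pattern [^\w\s] is easy to miss: \w matches the underscore, so underscores are not removed, and in Python 3 \w also matches non-ASCII letters in str patterns. The sketch below illustrates both points; the function names are hypothetical, and the second pattern is just one possible way to drop underscores too.

```python
import re


def strip_non_word(text):
    # Remove everything that is neither a word character nor whitespace
    return re.sub(r'[^\w\s]', '', text)


def strip_non_word_and_underscore(text):
    # Hypothetical variant: additionally remove the underscore
    return re.sub(r'[^\w\s]|_', '', text)


print(strip_non_word("snake_case, done!"))                 # snake_case done
print(strip_non_word_and_underscore("snake_case, done!"))  # snakecase done
print(strip_non_word("你好, world!"))                       # 你好 world
```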
Benchmark:
    import re, string, timeit

    s = "string. with. punctuation"
    exclude = set(string.punctuation)
    # Python 3: str.maketrans with a third argument builds a deletion table
    table = str.maketrans("", "", string.punctuation)
    regex = re.compile('[%s]' % re.escape(string.punctuation))

    def test_set(s):
        return ''.join(ch for ch in s if ch not in exclude)

    def test_re(s):
        return regex.sub('', s)

    def test_trans(s):
        return s.translate(table)

    def test_repl(s):
        for c in string.punctuation:
            s = s.replace(c, "")
        return s

    print("sets      :", timeit.Timer('f(s)', 'from __main__ import s, test_set as f').timeit(1000000))
    print("regex     :", timeit.Timer('f(s)', 'from __main__ import s, test_re as f').timeit(1000000))
    print("translate :", timeit.Timer('f(s)', 'from __main__ import s, test_trans as f').timeit(1000000))
    print("replace   :", timeit.Timer('f(s)', 'from __main__ import s, test_repl as f').timeit(1000000))

Output reported for the original Python 2 version of this script (seconds per 1,000,000 calls; absolute numbers will differ by machine and Python version):
    # sets      : 19.8566138744
    # regex     : 6.86155414581
    # translate : 2.12455511093
    # replace   : 28.4436721802