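The asker wants to use Python's re module to filter comments out of JavaScript source, in particular single-line comments (those starting with "//"). After a long time trying without success, they asked for help.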
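2. Solution
Answer 1
One helpful answerer gave a detailed solution:
- Use re.compile() to build a regular expression object from a single verbose pattern with three alternatives: code (including string literals), multi-line comments, and single-line comments. String literals are matched as part of the code, so a "//" inside a string is not mistaken for a comment.
- Apply the compiled pattern to the JavaScript source with findall(), which returns a list holding one tuple per match, with one slot per capture group.
- Pull the code, the multi-line comments, and the single-line comments out of that list and print each category separately.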
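The idea behind this answer (one pattern whose alternatives are tried in order, with strings recognized before comments) can be sketched with named groups. This is a simplified illustration, not the answerer's pattern: the names TOKEN and classify and the four group names are assumptions made for this example.

```python
import re

# Hypothetical, simplified token pattern: strings are tried first, then comments,
# then everything else. Not the pattern from the answer, only the same idea.
TOKEN = re.compile(r"""
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')   # quoted string literal
  | (?P<block>/\*.*?\*/)                              # /* ... */ comment
  | (?P<line>//[^\n]*)                                # // comment up to end of line
  | (?P<code>[^/"']+|/)                               # other code, or a lone slash
""", re.VERBOSE | re.DOTALL)

def classify(source):
    """Return (kind, text) pairs, where kind is the name of the group that matched."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(source)]

for kind, text in classify('a = "x // y" // z'):
    print(kind, repr(text))
```

Because the string alternative comes first, the "//" inside "x // y" is consumed as part of a string token and only the trailing "// z" is labelled as a line comment.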
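Answer 2
Another answerer offered a simpler solution:
- Use re.compile() to build a regular expression object from the pattern ".*(//(.*))$".
- Apply it to each line of JavaScript with match() and collect the captured groups.
- Take the single-line comment from each match and print it.
Because the leading .* is greedy, the match settles on the last "//" on the line. That copes with a "//" inside a string when a real comment follows it, but the pattern knows nothing about string literals: a line whose only "//" sits inside a string is still reported as a comment.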
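A short sketch of where the simple pattern works and where it does not. The helper name last_double_slash and the three sample lines are assumptions made for this illustration.

```python
import re

# Same pattern as in Answer 2; the greedy .* backtracks to the last "//" on the line.
simple = re.compile(r'.*(//(.*))$')

def last_double_slash(line):
    """Return the text after the last '//' on the line, or None if there is none."""
    m = simple.match(line)
    return m.group(2) if m else None

for line in ('var x = 2 // note', "bar = 'http://no.comments.com/'", 'var y = 3'):
    print(repr(last_double_slash(line)))
# prints ' note', then "no.comments.com/'" (a false positive), then None
```

The second line contains no comment at all, yet the tail of the URL is returned, which is why the per-line approach is only safe on input known to have no "//" inside strings.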
Code example
import re

# Regular expression from Answer 1: code (including string literals) lands in
# group 1, multi-line comments in group 2, single-line comment text in group 3.
reexpr = r"""
    (                               # Capture code
        "(?:\\.|[^"\\])*"           # Double-quoted string literal
        |
        '(?:\\.|[^'\\])*'           # Single-quoted string literal
        |
        (?:[^/\n"']|/[^/*\n"'])+    # Any code besides newlines or string literals
        |
        \n                          # Newline
    )|
    (/\* (?:[^*]|\*[^/])* \*/)      # Multi-line comment
    |
    (?://(.*)$)                     # Single-line comment
    $"""
rx_full = re.compile(reexpr, re.VERBOSE | re.MULTILINE)

# Regular expression from Answer 2 (kept under its own name so that it does
# not overwrite the Answer 1 pattern)
rx_simple = re.compile(r'.*(//(.*))$')

# Input JavaScript code
code = r"""// this is a comment
var x = 2 * 4 // and this is a comment too
var url = "http://www.google.com/" // and "this" too
url += 'but // this is not a comment' // however this one is
url += 'this "is not a comment' + " and ' neither is this " // only this
bar = 'http://no.comments.com/' // these // are // comments
bar = 'text // string ' no // more //\' // comments
bar = 'http://no.comments.com/'
bar = /var/ // comment
/* comment 1 */
bar = open() /* comment 2 */
bar = open() /* comment 2b */// another comment
bar = open( /* comment 3 */ file) // another comment
"""

# Answer 1: tokenize the whole source, then print each category
parts = rx_full.findall(code)
for title, index in (('Code', 0), ('Multi line comments', 1), ('One line comments', 2)):
    print('*' * 80)
    print(title + ':')
    # One non-blank fragment per line, surrounding whitespace trimmed
    print('\n'.join(x[index].strip() for x in parts if x[index].strip()))

# Answer 2: apply the simple pattern line by line
lines = ["// this is a comment",
         "var x = 2 // and this is a comment too",
         """var url = "http://www.google.com/" // and "this" too""",
         """url += 'but // this is not a comment' // however this one is""",
         """url += 'this "is not a comment' + " and ' neither is this " // only this"""]
print('*' * 80)
for line in lines:
    m = rx_simple.match(line)
    if m:  # a line without "//" does not match at all
        print(m.groups())
Output (surrounding whitespace trimmed). Each regex match is printed on its own line, so a source line that contains a string literal appears as several fragments:
********************************************************************************
Code:
var x = 2 * 4
var url =
"http://www.google.com/"
url +=
'but // this is not a comment'
url +=
'this "is not a comment'
+
" and ' neither is this "
bar =
'http://no.comments.com/'
bar =
'text // string '
no
bar =
'http://no.comments.com/'
bar = /var/
bar = open()
bar = open()
bar = open(
file)
********************************************************************************
Multi line comments:
/* comment 1 */
/* comment 2 */
/* comment 2b */
/* comment 3 */
********************************************************************************
One line comments:
this is a comment
and this is a comment too
and "this" too
however this one is
only this
these // are // comments
more //\' // comments
comment
another comment
another comment
********************************************************************************
('// this is a comment', ' this is a comment')
('// and this is a comment too', ' and this is a comment too')
('// and "this" too', ' and "this" too')
('// however this one is', ' however this one is')
('// only this', ' only this')
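Since the original goal was to filter comments out, the same kind of pattern can also be used with sub() to return the source with comments removed. The following is a minimal sketch under that assumption: the names PATTERN and strip_comments are introduced here and are not from either answer, and the pattern is the Answer 1 expression with the comment groups made non-capturing.

```python
import re

# Answer 1's pattern, rearranged so that only code is captured (group 1).
PATTERN = re.compile(r"""
    (                               # group 1: text to keep
        "(?:\\.|[^"\\])*"           # double-quoted string literal
      | '(?:\\.|[^'\\])*'           # single-quoted string literal
      | (?:[^/\n"']|/[^/*\n"'])+    # other code
      | \n                          # newline
    )
  | /\*(?:[^*]|\*[^/])*\*/          # multi-line comment (dropped)
  | //.*$                           # single-line comment (dropped)
""", re.VERBOSE | re.MULTILINE)

def strip_comments(source):
    """Return source with comments removed; code and strings are copied through."""
    # For a comment match group 1 is None, so the match is replaced by ''.
    return PATTERN.sub(lambda m: m.group(1) or '', source)

print(strip_comments("var u = 'a // b' // c\nx = 1 /* y */ + 2\n"))
```

Whitespace that surrounded a removed comment is left in place, so a line may end in a trailing space. Like the pattern it is built on, this is not a JavaScript parser: a regex literal that contains "//" or a quote character, for example, is not handled.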