php – DOM错误 – ID’someAnchor’已在实体中定义,第X行
|
如果我尝试将 HTML文档加载到 PHP DOM中,我会遇到以下错误: Error DOMDocument::loadHTML() [domdocument.loadhtml]: ID someAnchor already defined in Entity,line: 9 我无法理解为什么.这是一些将HTML字符串加载到DOM中的代码. 首先不包含锚标记,第二个包含锚标记.第二个文档产生错误. 希望您能够将其剪切并粘贴到脚本中并运行它以查看相同的输出: <?php
ini_set('display_errors',1);
error_reporting(E_ALL);
$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
</body>
</html>
EOT;
$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
<a name="someAnchor" id="someAnchor"></a>
</body>
</html>
EOT;
class domGrabber
{
public $_FileErrorStr = '';
/**
*@desc DOM object factory does the work of loading the DOM object
*/
public function getLoadAsDOMObj($htmlString)
{
$this->_FileErrorStr =''; //reset error container
$xmlDoc = new DOMDocument();
set_error_handler(array($this,'_FileErrorHandler')); // Warnings and errors are suppressed
$xmlDoc->loadHTML($htmlString);
restore_error_handler();
return $xmlDoc;
}
/**
*@desc public so that it can catch errors from outside this class
*/
public function _FileErrorHandler($errno,$errstr,$errfile,$errline)
{
if ($this->_FileErrorStr === null)
{
$this->_FileErrorStr = $errstr;
}
else {
$this->_FileErrorStr .= (PHP_EOL . $errstr);
}
}
}
$domGrabber = new domGrabber();
$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithNoAnchor );
echo 'PHP Version: '. phpversion() .'<br />'."n";
echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."n";
if ($domGrabber->_FileErrorStr)
{
echo 'Error'. $domGrabber->_FileErrorStr;
}
$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithAnchor);
echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."n";
if ($domGrabber->_FileErrorStr)
{
echo 'Error'. $domGrabber->_FileErrorStr;
}
我在Firefox源代码视图中输入以下内容: PHP Version: 5.2.9<br /> <pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body> <h1>Hello</h1> </body></html> </pre> <pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body> <h1>Hello</h1> <a name="someAnchor" id="someAnchor"></a> </body></html> </pre> Error DOMDocument::loadHTML() [<a href='domdocument.loadhtml'>domdocument.loadhtml</a>]: ID someAnchor already defined in Entity,line: 9 那么,为什么DOM说someAnchor已经被定义了? 更新: 我试验了两者 >而不是使用loadHTML()我使用了loadXML()方法 – 并修复了它 为完成起见,请参阅此处的比较脚本: <?php
ini_set('display_errors',1);
error_reporting(E_ALL);
$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithNoAnchor</p>
</body>
</html>
EOT;
$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchor</p>
<a name="someAnchor" id="someAnchor" ></a>
</body>
</html>
EOT;
$stringWithAnchorButOnlyIdAtt = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchorButOnlyIdAtt</p>
<a id="someAnchor"></a>
</body>
</html>
EOT;
class domGrabber
{
public $_FileErrorStr = '';
public $useHTMLMethod = TRUE;
/**
*@desc DOM object factory does the work of loading the DOM object
*/
public function loadDOMObjAndWriteOut($htmlString)
{
$this->_FileErrorStr ='';
$xmlDoc = new DOMDocument();
set_error_handler(array($this,'_FileErrorHandler')); // Warnings and errors are suppressed
if ($this->useHTMLMethod)
{
$xmlDoc->loadHTML($htmlString);
}
else {
$xmlDoc->loadXML($htmlString);
}
restore_error_handler();
echo "<h1>";
echo ($this->useHTMLMethod) ? 'using xmlDoc->loadHTML() ' : 'using $xmlDoc->loadXML()';
echo "</h1>";
echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."n";
if ($this->_FileErrorStr)
{
echo 'Error'. $this->_FileErrorStr;
}
}
/**
*@desc public so that it can catch errors from outside this class
*/
public function _FileErrorHandler($errno,$errline)
{
if ($this->_FileErrorStr === null)
{
$this->_FileErrorStr = $errstr;
}
else {
$this->_FileErrorStr .= (PHP_EOL . $errstr);
}
}
}
$domGrabber = new domGrabber();
echo 'PHP Version: '. phpversion() .'<br />'."n";
$domGrabber->useHTMLMethod = TRUE; //DOM->loadHTML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);
$domGrabber->useHTMLMethod = FALSE; //use DOM->loadXML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);
如果要加载XML文件(在这种情况下,XHTML是XML),则应使用
DOMDocument::loadXML(),而不是
DOMDocument::loadHTML().
在HTML中,name和id都引入了ID.所以你重复id“someAnchor”,因此错误. 但是,W3C验证器允许以您显示的形式重复ID< a id =“someAnchor”name =“someAnchor”>< / a>.这可能是libmxl2的错误. 在这个bug report for libxml2中,用户建议修补程序仅将name属性视为ID:
|
