Class HtmlHelper
- Namespace
- SunamoHtml.Html
- Assembly
- SunamoHtml.dll
Shared HTML helper methods (a mix of various utilities; consider splitting into more specific classes).
public static class HtmlHelper
- Inheritance
-
HtmlHelper
Methods
ClearSpaces(string)
Removes all space characters (non-breaking and regular spaces) from the text.
public static string ClearSpaces(string text)
Parameters
- text string
The text to clear spaces from.
Returns
- string
Text without spaces.
ConvertHtmlToText(string)
Converts HTML to plain text by decoding HTML entities, replacing BR tags with newlines, and stripping all tags.
public static string ConvertHtmlToText(string htmlContent)
Parameters
- htmlContent string
The HTML content to convert.
Returns
- string
Plain text without HTML tags.
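The three steps named in the summary can be sketched with the base class library alone. This is a minimal illustration, not the library's implementation: HtmlToTextSketch is a hypothetical name, and the step order (BR tags, then remaining tags, then entities) is an assumption chosen so that encoded angle brackets such as &lt; survive as text.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the documented steps; the real method may differ.
static string HtmlToTextSketch(string html)
{
    // 1. Turn every <br>, <BR/>, <br /> variant into a newline.
    string withNewlines = Regex.Replace(html, @"<br\s*/?>", "\n", RegexOptions.IgnoreCase);
    // 2. Drop every remaining tag.
    string withoutTags = Regex.Replace(withNewlines, @"<[^>]+>", string.Empty);
    // 3. Decode entities such as &amp; last, so decoded '<' is not mistaken for a tag.
    return WebUtility.HtmlDecode(withoutTags);
}

// Prints "Tom & Jerry" and "line two" on separate lines.
Console.WriteLine(HtmlToTextSketch("<p>Tom &amp; Jerry<br/>line two</p>"));
```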
ConvertTextToHtml(string)
Converts plain text to HTML by replacing newlines with BR tags.
public static string ConvertTextToHtml(string text)
Parameters
- text string
The text to convert.
Returns
- string
HTML with BR tags instead of newlines.
DeleteAttributesFromAllNodes(IList<HtmlNode>)
Deletes all attributes from all HTML nodes in a list.
public static void DeleteAttributesFromAllNodes(IList<HtmlNode> nodes)
Parameters
- nodes IList<HtmlNode>
GetTag(HtmlNode, string)
Returns the first child tag with the specified original name.
public static HtmlNode? GetTag(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
- tagName string
Returns
- HtmlNode
First matching child tag or null.
GetTagOfAtribute(HtmlNode, string, string, string)
Returns the first tag with the specified name and attribute value.
public static HtmlNode? GetTagOfAtribute(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for.
- attributeName string
The attribute name to match.
- attributeValue string
The attribute value to match.
Returns
- HtmlNode
First matching HTML node or null.
GetTagOfAtributeRek(HtmlNode, string, string, string)
Recursively searches for a tag with specified attribute name and value.
public static HtmlNode? GetTagOfAtributeRek(HtmlNode htmlNode, string nameOfTag, string nameOfAttribute, string valueOfAttribute)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- nameOfTag string
The tag name to search for.
- nameOfAttribute string
The attribute name to match.
- valueOfAttribute string
The attribute value to match.
Returns
- HtmlNode
Found HTML node or null.
GetTagsOfAtribute(HtmlNode, string, string, string)
Returns all immediate child tags with the specified name and attribute value.
public static IList<HtmlNode> GetTagsOfAtribute(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for.
- attributeName string
The attribute name to match.
- attributeValue string
The attribute value to match.
Returns
- IList<HtmlNode>
GetValuesOfStyle(HtmlNode)
Parses the style attribute of an HTML node and returns it as a dictionary.
public static Dictionary<string, string> GetValuesOfStyle(HtmlNode htmlNode)
Parameters
- htmlNode HtmlNode
The HTML node to get style values from.
Returns
- Dictionary<string, string>
Dictionary with style property names as keys and values as values.
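As an illustration of the kind of parsing described here, the following sketch turns the raw text of a style attribute into a dictionary. ParseStyleSketch is a hypothetical helper that works on a plain string rather than an HtmlNode; the case-insensitive keys and the "last declaration wins" rule are assumptions, not documented behavior of GetValuesOfStyle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: parses the raw value of a style attribute.
static Dictionary<string, string> ParseStyleSketch(string style)
{
    var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    foreach (string declaration in style.Split(';', StringSplitOptions.RemoveEmptyEntries))
    {
        // Split on the first colon only, so values such as url(http://...) stay intact.
        int colon = declaration.IndexOf(':');
        if (colon <= 0) continue; // skip malformed or blank declarations
        string name = declaration.Substring(0, colon).Trim();
        string value = declaration.Substring(colon + 1).Trim();
        if (name.Length > 0) result[name] = value; // assumption: last declaration wins
    }
    return result;
}

var styles = ParseStyleSketch("color: red; background: url(http://example.com/a.png); ");
Console.WriteLine(styles["color"]);      // red
Console.WriteLine(styles["background"]); // url(http://example.com/a.png)
```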
GetWithoutTextNodes(HtmlNode)
Gets all child nodes excluding text nodes.
public static IList<HtmlNode> GetWithoutTextNodes(HtmlNode htmlNode)
Parameters
- htmlNode HtmlNode
The HTML node to get children from.
Returns
- IList<HtmlNode>
HasChildTag(HtmlNode, string)
Checks if an HTML node has a child tag with the specified tag name.
public static bool HasChildTag(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
- tagName string
Returns
- bool
True if the node has a child tag with the specified name, false otherwise.
HasTagAttrContains(HtmlNode, string, string, string)
Checks if an HTML node has an attribute whose value, when split by delimiter, contains the specified value.
public static bool HasTagAttrContains(HtmlNode htmlNode, string delimiter, string attributeName, string value)
Parameters
- htmlNode HtmlNode
The HTML node to check.
- delimiter string
The delimiter to split the attribute value by.
- attributeName string
The attribute name to check.
- value string
The value to search for in the split parts.
Returns
- bool
True if the attribute value contains the value after splitting, false otherwise.
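The split-then-compare check differs from a plain substring test, which the sketch below tries to make concrete. AttrValueContainsSketch is a hypothetical helper that takes the attribute value directly instead of an HtmlNode; trimming the parts and comparing case-sensitively are assumptions.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when one of the delimited parts equals the value.
static bool AttrValueContainsSketch(string attributeValue, string delimiter, string value)
{
    return attributeValue
        .Split(new[] { delimiter }, StringSplitOptions.RemoveEmptyEntries)
        .Any(part => part.Trim() == value);
}

Console.WriteLine(AttrValueContainsSketch("btn btn-primary", " ", "btn"));   // True
Console.WriteLine(AttrValueContainsSketch("btn-primary large", " ", "btn")); // False
```

A substring test would report a match in the second call because "btn-primary" contains "btn"; comparing whole parts avoids that false positive.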
HighlightingWords(string, int, int, IList<string>)
Highlights searched words in text content with bold tags and returns sentence snippets. Before calling, whitespace characters in the content must be converted to spaces.
public static string HighlightingWords(string entireContent, int maxLettersPerSentence, int sentenceCount, IList<string> searchedWords)
Parameters
- entireContent string
The entire content to search in.
- maxLettersPerSentence int
Maximum letters per sentence snippet.
- sentenceCount int
Number of sentence snippets to return.
- searchedWords IList<string>
List of words to search for and highlight.
Returns
- string
HTML string with highlighted words in sentence snippets.
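The highlighting half of this method can be sketched as follows. HighlightSketch is a hypothetical helper that only wraps matches in bold tags; the snippet selection controlled by maxLettersPerSentence and sentenceCount is left out, and the case-insensitive matching is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: wraps every occurrence of the searched words in <b> tags.
static string HighlightSketch(string text, IEnumerable<string> searchedWords)
{
    string[] escaped = searchedWords
        .Where(word => !string.IsNullOrEmpty(word))
        .OrderByDescending(word => word.Length) // prefer the longest alternative
        .Select(Regex.Escape)
        .ToArray();
    if (escaped.Length == 0) return text;

    // One pass with a single alternation, so inserted <b> tags are never re-matched.
    string pattern = string.Join("|", escaped);
    return Regex.Replace(text, pattern, match => "<b>" + match.Value + "</b>", RegexOptions.IgnoreCase);
}

// Prints: The <b>quick</b> brown <b>fox</b>
Console.WriteLine(HighlightSketch("The quick brown fox", new[] { "quick", "FOX" }));
```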
PrepareToAttribute(string)
Prepares text for use in HTML attribute by replacing double quotes with single quotes.
public static string PrepareToAttribute(string text)
Parameters
- text string
The text to prepare.
Returns
- string
Text with double quotes replaced by single quotes.
RecursiveReturnTagsWithContainsAttr(IList<HtmlNode>, HtmlNode, string, string, string, bool, bool)
Recursively searches for tags whose attribute value matches the specified criteria. Supports the wildcard "*" for the tag name to match all tags.
public static void RecursiveReturnTagsWithContainsAttr(IList<HtmlNode> result, HtmlNode htmlNode, string tagName, string attributeName, string attributeValue, bool isContains, bool isRecursively)
Parameters
- result IList<HtmlNode>
The result list to add found nodes to.
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for, or "*" for all tags.
- attributeName string
The attribute name to check.
- attributeValue string
The attribute value to search for.
- isContains bool
Whether to use Contains instead of exact match.
- isRecursively bool
Whether to search recursively.
RemoveAllTags(string)
Removes all HTML tags from text by calling the StripAllTags method. Replaces every tag <*> with a period. Inner non-XML content is left as is.
public static string RemoveAllTags(string text)
Parameters
- text string
The text to remove tags from.
Returns
- string
Text without HTML tags.
ReplaceAllFontCase(string)
Replaces all case variations of the BR tag with the standard lowercase BR tag.
public static string ReplaceAllFontCase(string html)
Parameters
- html string
The HTML string to process.
Returns
- string
HTML with standardized BR tags.
ReplaceChildNodeByOuterHtml(HtmlNode, string, HtmlNode)
Replaces a child node, identified by matching its OuterHtml, with a new node.
public static void ReplaceChildNodeByOuterHtml(HtmlNode htmlNode, string oldOuterHtml, HtmlNode newNode)
Parameters
- htmlNode HtmlNode
The parent node containing the child to replace.
- oldOuterHtml string
The OuterHtml of the child node to replace.
- newNode HtmlNode
The new node to replace with.
ReplaceHtmlNonPairTagsWithXmlValid(string)
Replaces non-pair HTML tags with XML-valid equivalents by adding the self-closing slash. Problematic with automatic translation.
public static string ReplaceHtmlNonPairTagsWithXmlValid(string html)
Parameters
- html string
The HTML input string.
Returns
- string
HTML with XML-valid non-pair tags.
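To make "adds self-closing slash" concrete, here is a minimal regex-based sketch. SelfCloseVoidTagsSketch is a hypothetical name, the list of tags is an assumed subset of the HTML void elements, and the actual method may use a different list or a different technique.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: rewrites <br>, <img ...>, <hr/> and similar as <br />, <img ... />, <hr />.
static string SelfCloseVoidTagsSketch(string html)
{
    // Assumed subset of void elements; group 1 is the tag name, group 2 its attributes.
    const string pattern = @"<(br|hr|img|input|meta|link)\b([^>]*?)\s*/?>";
    return Regex.Replace(html, pattern, "<$1$2 />", RegexOptions.IgnoreCase);
}

// Prints: <p>a<br />b<img src="/a/b.png" /><hr /></p>
Console.WriteLine(SelfCloseVoidTagsSketch("<p>a<br>b<img src=\"/a/b.png\"><hr/></p>"));
```

Tags that are already self-closed come out unchanged apart from whitespace normalization, so running the sketch twice gives the same result as running it once.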
ReturnAllTags(HtmlNode, params string[])
Returns all child tags matching the specified tag names.
public static IList<HtmlNode> ReturnAllTags(HtmlNode htmlNode, params string[] tagNames)
Parameters
- htmlNode HtmlNode
- tagNames string[]
Returns
- IList<HtmlNode>
ReturnAllTagsImg(HtmlNode, string)
Returns all immediate child tags matching the specified tag name (non-recursive). If a tag matches the specified name, the search does not recurse into it.
public static IList<HtmlNode> ReturnAllTagsImg(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for (e.g., img).
Returns
- IList<HtmlNode>
ReturnApplyToAllTags(string, string, EditHtmlWidthHandler, string)
Returns HTML with all tags of the specified type modified by the handler. Not suitable for returning the content of an entire page.
public static string ReturnApplyToAllTags(string text, string tagName, EditHtmlWidthHandler handler, string value)
Parameters
- text string
The source code of the entire page.
- tagName string
The tag name to search for (div, a, etc.).
- handler EditHtmlWidthHandler
The handler method to apply to each tag.
- value string
Optional parameter passed to the handler.
Returns
- string
Modified HTML content.
ReturnTag(HtmlNode, string)
Returns the first child tag matching the specified tag name, or null if no such tag is found.
public static HtmlNode? ReturnTag(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
- tagName string
Returns
- HtmlNode
First matching HTML node or null.
ReturnTagRek(HtmlNode, object)
Recursively returns the first tag matching specified tag name.
public static HtmlNode ReturnTagRek(HtmlNode htmlNode, object tagName)
Parameters
- htmlNode HtmlNode
- tagName object
Returns
- HtmlNode
First matching tag or null.
ReturnTagRek(HtmlNode, string)
Recursively returns the first tag matching the specified tag name.
public static HtmlNode? ReturnTagRek(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
- tagName string
Returns
- HtmlNode
First matching tag or null.
ReturnTagWithAttr(HtmlNode, string, string, string)
Returns the first tag with the specified attribute name and value, or null if none is found.
public static HtmlNode? ReturnTagWithAttr(HtmlNode htmlNode, string tag, string attributeName, string value)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tag string
The tag name to search for.
- attributeName string
The attribute name to match.
- value string
The attribute value to match.
Returns
- HtmlNode
First matching HTML node or null.
ReturnTagWithAttrRek(HtmlNode, string, string, string)
Returns the first tag with the specified name and attribute value, searching the node tree recursively. Returns null if no such tag is found.
public static HtmlNode? ReturnTagWithAttrRek(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for.
- attributeName string
The attribute name to match.
- attributeValue string
The attribute value to match.
Returns
- HtmlNode
First matching HTML node or null.
ReturnTags(HtmlNode, string)
Returns all immediate child tags matching the specified tag name (non-recursive). The wildcard "*" can be passed, but it makes little sense here.
public static IList<HtmlNode> ReturnTags(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
- tagName string
Returns
- IList<HtmlNode>
ReturnTagsRek(HtmlNode, string)
Returns all tags matching the specified tag name, searching the node tree recursively. Supports the wildcard "*" to match all tags.
public static IList<HtmlNode> ReturnTagsRek(HtmlNode htmlNode, string tagName)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for, or "*" for all tags.
Returns
- IList<HtmlNode>
ReturnTagsWithAttrRek(HtmlNode, string, string, string)
Returns all tags matching the specified name and attribute value, searching the node tree recursively. Supports the wildcard "*" for the tag name to match all tags, and the wildcard "*" for the attribute value to match any value.
public static IList<HtmlNode> ReturnTagsWithAttrRek(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for, or "*" for all tags.
- attributeName string
The attribute name to match.
- attributeValue string
The attribute value to match, or "*" for any value.
Returns
- IList<HtmlNode>
ReturnTagsWithAttrRek2(HtmlNode, string, string, string)
Returns all tags with the specified name and attribute value, searching the node tree recursively. Originally from HtmlDocument.
public static IList<HtmlNode> ReturnTagsWithAttrRek2(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for.
- attributeName string
The attribute name to match.
- attributeValue string
The attribute value to match.
Returns
- IList<HtmlNode>
ReturnTagsWithContainsAttrRek(HtmlNode, string, string, string)
Returns all tags whose attribute value contains the specified text, searching the node tree recursively. Supports the wildcard "*" for the tag name to match all tags.
public static IList<HtmlNode> ReturnTagsWithContainsAttrRek(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for, or "*" for all tags.
- attributeName string
The attribute name to check.
- attributeValue string
The attribute value to search for.
Returns
- IList<HtmlNode>
ReturnTagsWithContainsAttrRek(HtmlNode, string, string, string, bool, bool)
Returns all tags whose attribute value matches the specified criteria, searching the node tree recursively.
public static IList<HtmlNode> ReturnTagsWithContainsAttrRek(HtmlNode htmlNode, string tagName, string attributeName, string attributeValue, bool isContains, bool isRecursively)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for.
- attributeName string
The attribute name to check.
- attributeValue string
The attribute value to search for.
- isContains bool
Whether to use Contains instead of exact match.
- isRecursively bool
Whether to search recursively.
Returns
- IList<HtmlNode>
ReturnTagsWithContainsClassRek(HtmlNode, string, string)
Returns all tags whose class attribute contains the specified class name, searching the node tree recursively. Supports the wildcard "*" for the tag name to match all tags.
public static IList<HtmlNode> ReturnTagsWithContainsClassRek(HtmlNode htmlNode, string tagName, string className)
Parameters
- htmlNode HtmlNode
The HTML node to search in.
- tagName string
The tag name to search for, or "*" for all tags.
- className string
The class name to search for.
Returns
- IList<HtmlNode>
StripAllTags(string)
Strips all HTML tags from text, replacing each with a single space.
public static string StripAllTags(string text)
Parameters
- text string
The text to strip tags from.
Returns
- string
Text without HTML tags.
StripAllTags(string, string)
Strips all HTML tags from text, replacing each with the specified replacement string.
public static string StripAllTags(string text, string replacement)
Parameters
- text string
- replacement string
Returns
- string
Text without HTML tags.
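The effect of the replacement argument is easiest to see side by side. The sketch below is a hypothetical regex-based stand-in (StripTagsSketch), not the library's code; it assumes a tag is anything between "<" and the next ">", which is a simplification that real HTML can break.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces every <...> run with the given string.
static string StripTagsSketch(string text, string replacement = " ")
{
    // A MatchEvaluator is used so that '$' in the replacement is taken literally.
    return Regex.Replace(text, @"<[^>]+>", _ => replacement);
}

string html = "<p>Hello <b>world</b></p>";
Console.WriteLine("[" + StripTagsSketch(html) + "]");      // [ Hello  world  ]
Console.WriteLine("[" + StripTagsSketch(html, "") + "]");  // [Hello world]
Console.WriteLine("[" + StripTagsSketch(html, ".") + "]"); // [.Hello .world..]
```

Replacing with a space keeps words from adjacent elements apart at the cost of doubled spaces; replacing with an empty string keeps the text compact but can glue such words together.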
StripAllTagsList(string)
Strips all HTML tags from text and returns the individual words as a list. Use RemoveAllNodes when the inner HTML must be removed as well.
public static IList<string> StripAllTagsList(string text)
Parameters
- text string
The HTML text to process.
Returns
- IList<string>
StripAllTagsSpace(string)
Strips all HTML tags from text, replacing every tag <*> with a space. Inner non-XML content is left as is.
public static string StripAllTagsSpace(string text)
Parameters
- text string
The text to strip tags from.
Returns
- string
Text without HTML tags.
ToXml(string)
Converts HTML to XML format and removes the XML declaration. Already calls RemoveXmlDeclaration and ReplaceHtmlNonPairTagsWithXmlValid.
public static string ToXml(string xml)
Parameters
- xml string
The HTML content to convert.
Returns
- string
XML-formatted content without declaration.
ToXml(string, bool)
Converts HTML to XML format, optionally removing the XML declaration. Already calls ReplaceHtmlNonPairTagsWithXmlValid.
public static string ToXml(string xml, bool isRemoveXmlDeclaration)
Parameters
- xml string
The HTML content to convert.
- isRemoveXmlDeclaration bool
Whether to remove the XML declaration.
Returns
- string
XML-formatted content.
ToXmlFinal(string)
Converts HTML to the final XML format by replacing non-pair tags with XML-valid versions and removing XML declarations.
public static string ToXmlFinal(string xml)
Parameters
- xml string
The HTML/XML content to convert.
Returns
- string
XML with UTF-8 declaration and valid non-pair tags.
TrimNode(HtmlNode)
Trims whitespace from an HTML node's inner content.
public static HtmlNode TrimNode(HtmlNode htmlNode)
Parameters
- htmlNode HtmlNode
The HTML node to trim.
Returns
- HtmlNode
The trimmed HTML node.
TrimOpenAndEndTags(string, string)
Removes opening and closing tags from HTML string.
public static string TrimOpenAndEndTags(string html, string nameOfTag)
Parameters
- html string
- nameOfTag string
Returns
- string
HTML without specified opening and closing tags.
TrimTexts(HtmlNodeCollection)
Trims whitespace from all HTML nodes in a collection.
public static IList<HtmlNode> TrimTexts(HtmlNodeCollection htmlNodeCollection)
Parameters
- htmlNodeCollection HtmlNodeCollection
The HTML node collection to trim.
Returns
- IList<HtmlNode>
TrimTexts(IList<HtmlNode>)
Trims whitespace from all HTML nodes in a list, removing text nodes.
public static IList<HtmlNode> TrimTexts(IList<HtmlNode> nodes)
Parameters
- nodes IList<HtmlNode>
Returns
- IList<HtmlNode>
TrimTexts(IList<HtmlNode>, bool, bool)
Trims whitespace from all HTML nodes in a list, optionally removing text nodes and comments.
public static IList<HtmlNode> TrimTexts(IList<HtmlNode> nodes, bool isRemoveTextNodes, bool isRemoveComments = false)
Parameters
- nodes IList<HtmlNode>
The list of HTML nodes to trim.
- isRemoveTextNodes bool
Whether to remove text nodes.
- isRemoveComments bool
Whether to remove comments.