Class HtmlTableParser
- Namespace
- SunamoHtml.Html
- Assembly
- SunamoHtml.dll
Parses HTML tables into a 2D string array, with colspan support and row/column indexing.
public sealed class HtmlTableParser
- Inheritance
- object
- HtmlTableParser
- Inherited Members
- Extension Methods
Constructors
HtmlTableParser(HtmlNode, bool)
Initializes a new instance by parsing an HTML table node.
public HtmlTableParser(HtmlNode html, bool isIgnoreFirstRow)
Parameters
- html (HtmlNode): The HTML table node to parse.
- isIgnoreFirstRow (bool): Whether to ignore the first row (typically headers).
Properties
ColumnCount
Gets the number of columns in the parsed table.
public int ColumnCount { get; }
Property Value
- int
Data
Gets or sets the parsed table data. If an element contains null, it was a colspan cell.
[SuppressMessage("Performance", "CA1819")]
public string[][] Data { get; set; }
Property Value
- string[][]
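The null convention in Data can be handled by callers in a few lines. The following is a minimal sketch, not part of SunamoHtml: ColspanDataSketch and FillSpannedCells are hypothetical names, and it assumes that Data is indexed as [row][column], that a spanning cell's text sits in the first position it covers, and that the remaining covered positions are null.

```csharp
using System;

// Hypothetical helper, not part of SunamoHtml.
public static class ColspanDataSketch
{
    // Returns a copy of the data in which each null (assumed colspan placeholder)
    // is replaced by the nearest non-null value to its left in the same row.
    public static string[][] FillSpannedCells(string[][] data)
    {
        var result = new string[data.Length][];
        for (int row = 0; row < data.Length; row++)
        {
            result[row] = new string[data[row].Length];
            string last = string.Empty; // used when a row starts with null
            for (int column = 0; column < data[row].Length; column++)
            {
                if (data[row][column] != null)
                {
                    last = data[row][column];
                }
                result[row][column] = last;
            }
        }
        return result;
    }
}
```

Returning a copy leaves the parser's own array untouched, which may matter because Data exposes a settable array that other callers can share.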
RowCount
Gets the number of rows in the parsed table.
public int RowCount { get; }
Property Value
- int
Methods
ColumnValues(int, bool, bool, bool)
Gets all values from a specific column by index.
public IList<string> ColumnValues(int columnIndex, bool isNormalizeValuesInColumn, bool isRemoveAlsoInnerHtmlOfSubNodes, bool isSkipFirstRow)
Parameters
- columnIndex (int): The zero-based column index.
- isNormalizeValuesInColumn (bool): Whether to normalize values by removing HTML.
- isRemoveAlsoInnerHtmlOfSubNodes (bool): Whether to remove inner HTML of sub nodes.
- isSkipFirstRow (bool): Whether to skip the first row.
Returns
- IList<string>
ColumnValues(string, bool, bool)
Gets all values from a column by column name (found in first row).
public IList<string> ColumnValues(string columnName, bool isNormalizeValuesInColumn, bool isRemoveAlsoInnerHtmlOfSubNodes)
Parameters
- columnName (string): The column name to search for in the first row.
- isNormalizeValuesInColumn (bool): Whether to normalize values by removing HTML.
- isRemoveAlsoInnerHtmlOfSubNodes (bool): Whether to remove inner HTML of sub nodes.
Returns
- IList<string>
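A lookup by column name can be pictured as two steps: find the name in the first row, then collect that column from the remaining rows. The sketch below illustrates the idea on a plain jagged array. ColumnLookupSketch and ValuesByHeader are hypothetical names, and the exact, case-sensitive match is an assumption; the library's actual matching rules are not documented here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of SunamoHtml.
public static class ColumnLookupSketch
{
    // Finds columnName in the first row (assumed exact, case-sensitive match)
    // and returns the values of that column from all following rows.
    public static IList<string> ValuesByHeader(string[][] data, string columnName)
    {
        var values = new List<string>();
        if (data.Length == 0)
        {
            return values;
        }

        int columnIndex = Array.IndexOf(data[0], columnName);
        if (columnIndex < 0)
        {
            return values; // header not found: empty result in this sketch
        }

        for (int row = 1; row < data.Length; row++)
        {
            // Skip rows that are shorter than the header row.
            if (columnIndex < data[row].Length)
            {
                values.Add(data[row][columnIndex]);
            }
        }
        return values;
    }
}
```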
NormalizeValuesInColumn(IList<string>, bool)
Normalizes values in a column by removing HTML tags and decoding HTML entities.
public static void NormalizeValuesInColumn(IList<string> chars, bool isRemoveAlsoInnerHtmlOfSubNodes)
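Because the method returns void, it presumably modifies the list in place. The sketch below shows one way such normalization could look, using only the .NET base library. NormalizeSketch and StripTagsAndDecode are hypothetical names, the regular expression is a simplification that does not handle every HTML construct, and the isRemoveAlsoInnerHtmlOfSubNodes behavior is not modeled.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of SunamoHtml.
public static class NormalizeSketch
{
    // Matches anything that looks like a tag, e.g. <b> or </span>.
    private static readonly Regex Tag = new Regex("<[^>]+>", RegexOptions.Compiled);

    // Removes tags and decodes HTML entities for every non-null item, in place.
    public static void StripTagsAndDecode(IList<string> values)
    {
        for (int i = 0; i < values.Count; i++)
        {
            if (values[i] == null)
            {
                continue; // keep null entries (e.g. colspan placeholders) as they are
            }
            string withoutTags = Tag.Replace(values[i], string.Empty);
            values[i] = WebUtility.HtmlDecode(withoutTags).Trim();
        }
    }
}
```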