I thought that I understood how to persist text markup from the WebHTMLEditor to a database field and then reload the contents of the control later when it is requested. But I figure based on the 'will not fix' response of my bug report, maybe I don't understand. Or maybe the language barrier is getting in the way and the support team does not understand. So, I thought that I would put it to a vote here. For you IG folks, the BR number is BR31191.
Simple requirement really. I just need a textbox that can format text and doesn't crash my server, kind of like the one that you have in the forum. When the page loads again, I need to retrieve the contents from the database (a varchar(max) field) and initialize the control. I am using the .Text property to do this since the .TextXHtml property is marked read-only. In some scenarios, what comes out of the control is not exactly what I put into it. When it comes time to persist the markup back into the database, the markup I get out of the .TextXHtml property is often unbalanced. In particular, it likes to add extra unbalanced paragraph tags at the end of the markup and rip out certain other end tags in the middle, and it always rips out the </body> and </html> tags at the end.
When this happens repeatedly, our worker process crashes at the moment the .TextXHtml property is read. If the crashes keep recurring, the entire application pool shuts down because of the pool's settings. In other words, a bug in this control is bringing down my servers.
I have not seen this happen when the text starts and ends as something the user typed into the control. Where I have seen it pop up most often is with text which was pasted in from Word or even Excel. I have since taken steps to strip all markup on paste, but what remains in the database is valid markup in the sense that it would pass the W3C validity checks. Yet I think this 'valid markup' sometimes confuses the control because of how the tags are nested, and what I get on the back end is not valid markup. For the purposes of opening this ticket with IG, I prepared an example of HTML markup which passes the Visual Studio 2005 content editor checks. I initialized the control with that valid markup (using the paradigm below). When I read the contents of the control right after initializing it, what I got back was markup with 14 errors. This is not acceptable.
The number of errors is compounded later when the text with errors is persisted into the database. For example, if I save the markup with those 14 errors and later initialize the control with it, the next time I persist to the database I might get 50 errors. The bottom line is either the control is buggy and IG won't admit it, or I am not understanding the paradigm.
So, in summary, the flow that I am using is below. Do you recommend a different approach to initializing the contents with HTML markup and then reading the contents later? Am I wrong to think that the .TextXHtml property should give me back exactly what I put into it?
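Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
    If Not Me.IsPostBack Then
        ' Initialize the editor from the stored markup (LoadFromDb is a placeholder name)
        Me.WebHTMLEditor1.Text = LoadFromDb()
    End If
End Sub
' Later, when saving:
Persisttodb(Me.WebHTMLEditor1.TextXHtml)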
The HTMLEditor doesn't support <html> or <body> tags - they will be stripped out. The editor's main function is to create or edit html formatted text, but not an entire html document.
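Since the editor only keeps a fragment, one option is to store and load just the fragment so that the first round trip does not strip anything. The following is a minimal sketch, not part of the control's API: ExtractBodyFragment is a hypothetical helper name, and the regular expression assumes at most one <body> element in the stored markup.

```vbnet
Imports System.Text.RegularExpressions

Module EditorFragment
    ' Hypothetical helper (not an Infragistics API): returns only the markup
    ' found between <body ...> and </body>. If no <body> element is present,
    ' the input is returned unchanged.
    Function ExtractBodyFragment(ByVal html As String) As String
        If String.IsNullOrEmpty(html) Then Return String.Empty
        Dim m As Match = Regex.Match(html, "<body\b[^>]*>(.*?)</body\s*>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then Return m.Groups(1).Value.Trim()
        Return html
    End Function
End Module
```

It could be called just before the editor is initialized, for example Me.WebHTMLEditor1.Text = ExtractBodyFragment(storedMarkup), where storedMarkup stands for whatever was read from the database.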
Due to the underlying implementation of the editor, it's going to make some minor changes to text which isn't formatted in a way that Internet Explorer likes. You may remember how Visual Studio 7 used to reformat your ASPX? The HTML content editor that was the base of the VS design surface is the same one at fault for changing the markup in IE. This is an unfortunate implementation detail, and we've done quite a bit to work around some of the more annoying aspects of it. Still, there will be some minor adjustments. However, the adjustments should only happen once, and shouldn't be recursive like you're seeing.
There are two possibilities that I can think of at this point. Either the TextXHtml property is improperly parsing the data and adding additional tags, or Internet Explorer is. In both of these cases, I suspect that there's some malformed HTML that's causing the problem. Most parsers have a really hard time when they reach an unterminated string or an extraneous closing tag. I know that you mentioned that the text passes the W3C validator. Were you validating for HTML 4.01 or XHTML? An XHTML validator will be much stricter with quotation marks and closing tags, and may catch something that the HTML 4.01 validator missed.
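A quick way to look for an unterminated string or a stray closing tag before the markup ever reaches the editor is to run it through a strict XML parser. This is only a rough sketch: IsWellFormedFragment is a hypothetical helper name, and wrapping the fragment in a <root> element is an assumption made so that a snippet with several top-level elements can be parsed.

```vbnet
Imports System.Xml

Module FragmentCheck
    ' Hypothetical helper: returns True when the fragment parses as
    ' well-formed XML once wrapped in a single root element.
    ' On failure, errorMessage carries the parser's complaint.
    Function IsWellFormedFragment(ByVal fragment As String, _
                                  ByRef errorMessage As String) As Boolean
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml("<root>" & fragment & "</root>")
            errorMessage = String.Empty
            Return True
        Catch ex As XmlException
            errorMessage = ex.Message
            Return False
        End Try
    End Function
End Module
```

Note that this checks well-formedness only, not validity against a DTD, and that named HTML entities such as &nbsp; are not declared in plain XML, so they will be reported as errors unless they are replaced first.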
The developer has looked into the problem and agrees that the text will change (for the reasons described above), but we're still trying to reproduce the problem with a controlled (small) text input.
-Tony
I'm getting this same thing, particularly when dealing with urls.
I put the following html in the editor control <a href="http://www.yahoo.com/somepage.aspx?param1=blh&param2=blah2">yahoo</a>
I then retrieve the text via TextXhtml and it escapes the ampersands for me. I save this to a database.
Later, I come back to edit the html. I set the editor html via .Text = HtmlFromDatabase. I edit my content without touching the url part. I get the content to save to db again via TextXhtml. The html retrieved looks like this:
<a href="http://www.yahoo.com/somepage.aspx?param1=blh&amp;param2=blah2">yahoo</a>
Again, notice the escaped ampersands, but this time, it alters the url. If you repeat the steps, the url gets longer and longer as it escapes the ampersands over and over.
What's the fix?
You need to use either .TextXHtml or .Text consistently. If you're persisting the .TextXHtml into your database, you shouldn't then be setting the .Text property on the editor, because that's what causes the constant changes. Stick with one property or the other, but don't mix and match. Let me know if that helps.
But TextXHtml is readonly. I need to save the html as XHTML. How would one do this and be able to come back and edit it later? Is there a way to tell the editor that the .Text that I'm giving it is XHTML?
Pseudo code:
1. Create content
2. Get content from control via TextXHtml
3. Save content to database
4. Come back later to edit
5. Get content from database
6. Set control content via .Text = contentfromdb
7. Edit, rinse, and repeat.
I suppose line #6 is the one in question.
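If step 6 really is where the already-escaped markup goes back in, one possible workaround is to undo the ampersand escaping just before assigning .Text, so that TextXHtml escapes it exactly once on the way out. This is a sketch under that assumption only: PrepareForEditor is a hypothetical helper name, and it has not been checked against the control itself.

```vbnet
Imports System.Text.RegularExpressions

Module AmpersandRoundTrip
    ' Hypothetical helper: turns &amp; (and any piled-up &amp;amp;... left by
    ' earlier save cycles) back into a bare ampersand before the markup is
    ' handed to the editor. Other entities such as &lt; are left alone.
    ' Caveat: text that is meant to display a literal entity, for example
    ' &amp;lt;, would be changed to &lt; by this replacement.
    Function PrepareForEditor(ByVal storedXhtml As String) As String
        If String.IsNullOrEmpty(storedXhtml) Then Return String.Empty
        Return Regex.Replace(storedXhtml, "&(?:amp;)+", "&")
    End Function
End Module
```

Step 6 would then read Me.WebHTMLEditor1.Text = PrepareForEditor(HtmlFromDatabase). Markup that has already gone through several save cycles is repaired by the same call, since a whole run of repeated escapes collapses to a single ampersand.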
On another note, is it possible to apply an external stylesheet?
Is there a reason you need to push the "XHTML" representation into your database? If it's being entered in as regular html, can you persist it into your backend database as html? If the content was initially generated as XHTML, it may resolve the issue as well. That said, we may be able to do something to better handle '&' in the conversion process. I'll forward this thread onto the development team to take a look at.
There's no built-in way to apply an external stylesheet, mainly because you're only constructing an HTML snippet, not an entire page, so there's no <head> element to put link tags into.
Something needs to be done about this ASAP.
If I have a simple bulleted list using the html such as:
<ul><li>Database upgrades</li></ul>
The control modifies the tags to be uppercase, i.e.
<UL><LI>Database upgrades</LI></UL>
This is not valid markup. Here's what http://validator.w3.org has to say about it:
Line 221, Column 3: element "LI" undefined.
<LI>Database upgrades</LI>
You have used the element named above in your document, but the document type you are using does not define an element of that name. This error is often caused by: