维生素C.net

We cannot solve our problems with the same thinking we used when we created them

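Pinned post

Please leave the following in a comment:

ID
Nickname
Blog URL
Industry focus

To spare yourself the inconvenience of spam, please do not leave an email address or similar information.

Both QQ groups of the cnblogs training team are now full. During the National Day holiday we will remove members who have not posted in the groups within the last 2 months. If anyone from the community is able to create another group, please help out; thanks in advance.

-------
Example:

ID:fanweixiao
Nickname: new 维生素.net()
blog:www.cnblogs.com/fanweixiao
Industry focus: web

posted @ 2008-09-27 13:59 new 维生素C.net() Views (310) | Comments (68)

June 29, 2009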

http://www.computerhope.com/batch.htm

posted @ 2009-06-29 13:52 new 维生素C.net() Views (10) | Comments (0)

The dump files were captured in -crash mode; open the dump for the 2nd chance exception.

0:037> .loadby sos mscorwks
0:037> .reload
................................................................
................................................................
.............................
Missing image name, possible paged-out or corrupt data.
Loading unloaded module list
...
0:037> kb 2000
RetAddr           : Args to Child                                                           : Call Site
00000642`7f53c8e5 : 00000000`04553010 00000000`00000055 00000000`e0434f4d 00000000`04553010 : kernel32!RaiseException+0x5c
00000642`7f8b55e7 : 00000000`8252c780 00000000`00000000 00000642`00000000 00000642`00000001 : mscorwks!RaiseTheExceptionInternalOnly+0x295
00000642`7f8b62c6 : 00002026`00000001 00000642`00000000 00008653`fba98e98 00000000`00000000 : mscorwks!RaiseTheException+0x57
00000642`7f938b55 : 00000000`00000055 00000000`04553001 00000000`00000015 00000000`00000000 : mscorwks!BStrFromString+0x66
00000642`7f938b6b : 00000000`8252c780 00000000`00000000 00000000`00000000 00000000`00000000 : mscorwks!RealCOMPlusThrow+0x35
00000642`7f956b56 : 00000000`07d9f7c0 ffffffff`ffffffff 00000000`00000003 00000000`00000000 : mscorwks!RealCOMPlusThrow+0xb
00000642`7f7fc3b8 : 00000000`00000000 00000000`07d9f9c8 ffffffff`00000001 00000000`044ad2e0 : mscorwks!Thread::RaiseCrossContextException+0x2d6
00000642`7f447f0d : 00000000`00000000 00000000`00000001 00000000`00000000 00000642`7f50946a : mscorwks!`string'+0x62638
00000642`7f556aa9 : 00000000`0d4654cd 00000642`7f496500 00000000`00000000 00000000`07d9fba8 : mscorwks!Thread::DoADCallBack+0x4ad
00000642`7f43afdd : 00000000`044ad2e0 00000000`04553010 00000000`07d9fab0 00000000`0010a0a0 : mscorwks!CNgenEntryBind::Create+0x15d
00000642`7f435296 : 00000000`07d9fba8 ffffffff`ffffffff 00000000`04553010 00000000`07d9e710 : mscorwks!MethodTable::IsAbstract+0x49
00000642`7f4162bb : ffffffff`fffffffe 00000000`00000001 ffffffff`fffffffe 00000000`0a51cd50 : mscorwks!AddTimerCallbackEx+0xba
00000642`7f495fa7 : ffffffff`fffffffe 00000000`00000001 00000000`00000000 00000000`00000001 : mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0x53
00000642`7f4aad0a : 00000000`00000001 00000000`00000000 00000000`00000002 00000000`04553010 : mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0x157
00000642`7f41f9a0 : 00000000`00000000 00000000`00000000 00000000`07d9ff50 00000000`00000000 : mscorwks!ThreadpoolMgr::WorkerThreadStart+0x1ba
00000000`77d6b6da : 00000000`77d6b6a0 00000000`00000000 00000000`00000000 00000000`07d9ffa8 : mscorwks!Thread::intermediateThreadProc+0x78
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadStart+0x3a

0:037> !clrstack
OS Thread Id: 0x17e0 (37)
Child-SP         RetAddr          Call Site

0:037> kL 2000
Child-SP          RetAddr           Call Site
00000000`07d9f3d0 00000642`7f53c8e5 kernel32!RaiseException+0x5c
00000000`07d9f4a0 00000642`7f8b55e7 mscorwks!RaiseTheExceptionInternalOnly+0x295
00000000`07d9f570 00000642`7f8b62c6 mscorwks!RaiseTheException+0x57
00000000`07d9f5a0 00000642`7f938b55 mscorwks!BStrFromString+0x66
00000000`07d9f5d0 00000642`7f938b6b mscorwks!RealCOMPlusThrow+0x35
00000000`07d9f640 00000642`7f956b56 mscorwks!RealCOMPlusThrow+0xb
00000000`07d9f670 00000642`7f7fc3b8 mscorwks!Thread::RaiseCrossContextException+0x2d6
00000000`07d9f880 00000642`7f447f0d mscorwks!`string'+0x62638
00000000`07d9f9f0 00000642`7f556aa9 mscorwks!Thread::DoADCallBack+0x4ad
00000000`07d9fa40 00000642`7f43afdd mscorwks!CNgenEntryBind::Create+0x15d
00000000`07d9fb10 00000642`7f435296 mscorwks!MethodTable::IsAbstract+0x49
00000000`07d9fb50 00000642`7f4162bb mscorwks!AddTimerCallbackEx+0xba
00000000`07d9fc10 00000642`7f495fa7 mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0x53
00000000`07d9fc70 00000642`7f4aad0a mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0x157
00000000`07d9fd10 00000642`7f41f9a0 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x1ba
00000000`07d9fdb0 00000000`77d6b6da mscorwks!Thread::intermediateThreadProc+0x78
00000000`07d9ff80 00000000`00000000 kernel32!BaseThreadStart+0x3a

0:037> !pe (print the current exception)
Exception object: 000000008252c780
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException: <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80004003

0:037> !dso (dump stack objects)
OS Thread Id: 0x17e0 (37)
RSP/REG          Object           Name
0000000007d9f468 000000008252c780 System.NullReferenceException
0000000007d9f570 000000008252c780 System.NullReferenceException
0000000007d9f640 000000008252c780 System.NullReferenceException
0000000007d9f6b0 000000008252c780 System.NullReferenceException
0000000007d9f6e8 000000008232a548 System.NullReferenceException <--- (find the original exception)
0000000007d9f740 00000001600aa398 System.Threading.Thread

0:037> !pe 000000008232a548 (inspect the original exception)
Exception object: 000000008232a548
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException: <none>
StackTrace (generated):
    SP               IP               Function
    0000000007D9C600 00000642BD3E863C System_Web_ni!System.Web.SessionState.SessionStateModule.PollLockedSessionCallback(System.Object)+0x2d9a2c
    0000000007D9EB40 000006427830C878 mscorlib_ni!System.Threading.ExecutionContext.runTryCode(System.Object)+0x178
    0000000007D9F3F0 0000000000000001 mscorlib_ni!System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)+0x2
    0000000007D9F3F0 00000642782F1702 mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext,System.Threading.ContextCallback, System.Object)+0x62
    0000000007D9F440 000006427834D696 mscorlib_ni!System.Threading._TimerCallback.PerformTimerCallback(System.Object)+0x86

StackTraceString: <none>
HResult: 80004003

posted @ 2009-06-29 13:52 new 维生素C.net() Views (12) | Comments (0)

June 16, 2009

The commands to use are lrz and lsz.

lrzsz can be downloaded here: http://www.ohse.de/uwe/software/lrzsz.html

posted @ 2009-06-16 11:24 new 维生素C.net() Views (21) | Comments (0)

May 30, 2009

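The two common backup methods are Full backup and Simulated incremental, i.e. what we usually call full and incremental backup. The former is slow to back up and fast to restore; the latter is the opposite. The two are normally used together, for example one full backup per week plus one or more incrementals per day. Only the combination meets the goal of backing up quickly and restoring quickly.

Even so, this is not the best approach. A better option is True incremental (differential backup), which can deliver both fast backup and fast restore.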
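The restore-time trade-off can be sketched by counting how many backup sets a restore has to read. This is only an illustration under the conventional definitions, which are an assumption here and may not match the product terms used above: an "incremental" holds changes since the previous backup of any kind, and a "differential" holds all changes since the last full backup. The `Backup` type and `restore_chain` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day index in the schedule; day 0 is the full backup
    kind: str  # "full", "incremental" or "differential" (assumed definitions)


def restore_chain(backups, target_day):
    """Return, in order, the backups that must be read to restore target_day."""
    usable = sorted((b for b in backups if b.day <= target_day), key=lambda b: b.day)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the target day")
    base = fulls[-1]
    later = [b for b in usable if b.day > base.day]
    chain = [base]
    # A differential already contains everything since the full backup,
    # so only the most recent one is needed.
    diffs = [b for b in later if b.kind == "differential"]
    if diffs:
        chain.append(diffs[-1])
        later = [b for b in later if b.day > diffs[-1].day]
    # Incrementals must all be replayed, one after another.
    chain.extend(b for b in later if b.kind == "incremental")
    return chain


weekly_incremental = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 7)]
weekly_differential = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 7)]

print(len(restore_chain(weekly_incremental, 6)), "sets to read with daily incrementals")
print(len(restore_chain(weekly_differential, 6)), "sets to read with daily differentials")
```

Under these assumptions the restore chain for the last day of the week grows with every incremental, while the differential schedule always needs just the full backup and the latest differential.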

posted @ 2009-05-30 19:07 new 维生素C.net() Views (19) | Comments (0)

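Before .NET 3.5 there was nothing we could do about the GC. Our web servers are normally multi-threaded and multi-core, so we run in Server GC mode. The defining trait of Server GC is that all memory allocation is suspended while a collection is in progress, which means the application effectively stops responding. The upside is that #procs * #cores GC threads work on the collection together, so it finishes much faster.

For heavily loaded ASP.NET applications, though, the GC is a problem: if the code is poorly written, worker threads and GC threads can easily end up contending on locks, and so on.

When .NET 3.5 arrived (the API shipped with SP1), the GC became able to notify the application before a full collection is triggered. My immediate thought was that a web server about to run a GC could hand its load over to another web server. Very cool.

The following members are related to this feature: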

GC.RegisterForFullGCNotification

GC.WaitForFullGCApproach

GCNotificationStatus

GC.WaitForFullGCComplete

GC.CancelFullGCNotification

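In .NET 4.0, background GC allows a new ephemeral collection (gen0 and gen1) to start while a full background GC is in progress, and a new segment can be allocated when needed. This avoids many of the situations that used to cause blocking.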

posted @ 2009-05-30 15:38 new 维生素C.net() Views (42) | Comments (1)

May 29, 2009

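Why this post

BeITMemcached Client is a fairly good open-source Memcached client for .NET. It uses a modified Fnv32 hash as its consistent hashing algorithm, but testing showed that this algorithm has serious problems. I replaced it with an algorithm based on the MD5 implementation in .NET, and the results were pretty good.

Why consistent hashing exists:

For more efficient, safer and more dependable Scale-Out.

Test method:
1. Assume there are 3 cache servers; the yellow text in the images is each cache server's key.
2. Use Guid.ToString("N") values as the cache keys of the test data.
3. Distribute 10,000 test items across the existing servers, then simulate adding servers, rehash, and compare the two distributions.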
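The test method above can be sketched in a few lines. This is a minimal illustration, not the BeITMemcached code and not the MD5-based variant the post refers to: it assumes a plain hash ring with 100 virtual points per server, takes the first 4 bytes of an MD5 digest as the ring position, and uses made-up server keys (`cache-a` to `cache-d`). `uuid.uuid4().hex` produces the same 32-hex-digit shape as `Guid.ToString("N")`.

```python
import bisect
import hashlib
import uuid


def md5_point(value: str) -> int:
    """Map a string to a 32-bit ring position using the first 4 MD5 bytes."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual points per server."""

    def __init__(self, servers, replicas=100):
        self._owner = {}  # ring position -> server key
        for server in servers:
            for i in range(replicas):
                self._owner[md5_point(f"{server}#{i}")] = server
        self._points = sorted(self._owner)

    def lookup(self, key: str) -> str:
        # First ring point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._points, md5_point(key)) % len(self._points)
        return self._owner[self._points[idx]]


def moved_fraction(before, after, keys):
    """Share of keys whose owning server differs between the two rings."""
    return sum(1 for k in keys if before.lookup(k) != after.lookup(k)) / len(keys)


keys = [uuid.uuid4().hex for _ in range(10_000)]
three = HashRing(["cache-a", "cache-b", "cache-c"])
four = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
print(f"keys moved after adding one server: {moved_fraction(three, four, keys):.1%}")
```

With a ring like this, a key only ever moves to the newly added server, so comparing the per-server counts before and after the rehash shows both how even the distribution is and how much data migrates.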

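[First, the results with the modified Fnv32 algorithm]

Case 1: after adding one cache server

Case 2: after doubling the number of cache servers

[Using the improved MD5-based algorithm]

Case 1: after adding one cache server

Case 2: after doubling the number of cache servers

------Summary-------------(a fancy divider)-----------------------------

[The biggest flaws of the modified Fnv32 algorithm]

a) Uneven distribution. Most load-balancing algorithms produce similar results: the first server tends to take more of the load, which is why we pick a more powerful machine for the first server when setting up load balancing. A cache server, however, lives or dies by its hit rate. Once the data assigned to one server exceeds the memory available to its memcached instance, memcached automatically reuses the space of old entries to store new ones. At that point the cache server has essentially lost its value as a cache and adds pressure to the system.

b) Data migration on one particular server is severe. People who learned everything from books, like people who idolize technical gurus, tend to judge good and bad only by the final computed number. But what is the core of distribution and load balancing? In my experience and understanding, it is the most basic law of nature: balance. Only a balanced state is elegant.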

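----------Lessons from using caches------------

Based on my experience with caching on .NET, here are a few points that people may misunderstand:

1. There is usually not that much worth caching: don't add system complexity just for the sake of using a cache.
2. A cache can greatly raise the load a system can take, which tempts people to ignore how the db copes at full load: from a business-architecture standpoint the cache is often unreliable.
3. The point of a cache is not to dump objects somewhere that burdens the GC: what a cache needs is hit rate. Of course this does not matter if the cache lives outside the asp.net process.
4. memcached is not a cure-all: the greatest potential of a cache lies in bringing data closer to the cpu of the web server that executes the request.

posted @ 2009-05-29 03:31 new 维生素C.net() Views (59) | Comments (0)

May 27, 2009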

>>>>Boom and bust

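After several years of working on the Internet, my deepest lesson is the "18-month architecture". The process starts with the company's return-on-investment model for the site, moves on to the product team designing the product, and only then reaches feature implementation. When a feature requirement lands on my desk, what I would rather see is a roadmap: a product plan that covers exactly 18 months, no more and no less.

Why 18 months? There is no very scientific explanation; it comes entirely from analyzing the products of successful Internet companies and from my own experience. The less successful a company is, the narrower its thinking when it designs products (leaving aside small, innovative Internet companies with very strong engineering). The most visible sign of an excellent Internet company is its attitude toward its product: what it ultimately wants to build, how each step gets implemented, the linear growth and catalytic effects that decorative features bring, and so on should all be thought through within those 18 months.

Why does an extremely simple feature have to take so much effort? Why go against the books and the advice of some technical guru and write code with bad smells? Beyond high scalability, high availability and support for high concurrency, is there something even more important in a website's architecture?

We are all engineers here, so let's start from the Starbucks view of http: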

"Even those who are Web-savvy often struggle to understand that the Web isn't about middleware solutions supporting XML over HTTP, nor is it a crude RPC mechanism. This is a shame because the Web has much more value than simple point-to-point connectivity; it is in fact a robust integration platform."

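I largely agree with their view of http. I also think that http, as a collection of atomic operations, really plays the role of an integration platform. But when we split every feature into Get and Post operations and keep our focus on those, we lose sight of "strategy". Strategy is the product roadmap, the step-by-step realization of the business model. So web architecture has one more dimension, which I like to call "product strategy architecture", the thing people usually refer to as "application architecture".

Web architecture is therefore a balance between application architecture and system architecture. Choosing that balance means giving up many things that "look wonderful". A web architecture is never perfect, and Scale-Out will not solve every problem forever. Past a certain stage, rewriting is cheaper than refactoring: when an architecture can no longer carry the current data volume and traffic, designing a new solution is usually the better choice. The product, or some part of it, has reached a tipping point, and that is something to build on! Suppose M months have passed by then. The remaining 18-M months no longer matter that much, because changes to Internet products are hard to predict. When doing "strategy" on top of a relatively mature product, the product team should put forward a 6-month roadmap. During that period the architecture has to be refactored, expanded and tuned for performance to keep up. Those 6 months are usually experimental product changes; once things are relatively stable, it is time to think about how to rewrite.

Back to the title of this post. Wikipedia defines the economics term Boom and bust as follows: “Boom and bust phenomenon have existed for centuries. During a "boom" period, buyers find themselves paying increasingly higher prices until the "bust", at which time the goods and commodities for which they have paid inflated prices may end up as valueless or nearly so.” Unfortunately nobody has translated that entry into Chinese. Roughly, it says that during a boom buyers pay higher and higher prices until the bust arrives, at which point the goods they bought at inflated prices may turn out to be worth little or nothing.

So my understanding of web architecture is this: product design and system architecture are tightly coupled. What gets architected is not features but the product and its integration with other products, while keeping the whole system highly available under high concurrency and heavy load.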

posted @ 2009-05-27 02:05 new 维生素C.net() Views (23) | Comments (0)

May 12, 2009

Original article: http://www.dbms2.com/2009/05/11/facebook-hadoop-and-hive/

A few weeks ago, I posted about a conversation I had with Jeff Hammerbacher of Cloudera, in which he discussed a Hadoop-based effort at Facebook he previously directed. Subsequently, Ashish Thusoo and Joydeep Sarma of Facebook contacted me to expand upon and in a couple of instances correct what Jeff had said. They also filled me in on Hive, a data-manipulation add-on to Hadoop that they developed and subsequently open-sourced.

Updating the metrics in my Cloudera post,

  • Facebook has 400 terabytes of disk managed by Hadoop/Hive, with a slightly better than 6:1 overall compression ratio. So the 2 1/2 petabytes figure for user data is reasonable.
  • Facebook’s Hadoop/Hive system ingests 15 terabytes of new data per day now, not 10.
  • Hadoop/Hive cycle times aren’t as fast as I thought I heard from Jeff. Ad targeting queries are the most frequent, and they’re run hourly. Dashboards are repopulated daily.

Nothing else in my Cloudera post was called out as being wrong.

In a new-to-me metric, Facebook has 610 Hadoop nodes, running in a single cluster, due to be increased to 1000 soon. Facebook thinks this is the second-largest* Hadoop installation, or else close to it. What’s more, Facebook believes it is unusual in spreading all its apps across a single huge cluster, rather than doing different kinds of work on different, smaller sub-clusters.

*Apparently, Yahoo is at 2000 nodes (and headed for 4000), 1000 or so of which are operated as a single cluster for a single app.

Facebook decided in 2007 to move what was then a 15 terabyte big-DBMS-vendor data warehouse to Hadoop — augmented by Hive — rather than to an MPP data warehouse DBMS. Major drivers of the choice included:

  • License/maintenance costs. Free is a good price.
  • Open source flexibility. Facebook is one of the few users I’ve ever spoken with that actually cares about modifying open source code.
  • Ability to run on cheap hardware. Facebook runs real-time MySQL instances on boxes that cost $10K or so, and would expect to pay at least as much for an MPP DBMS node. But Hadoop nodes run on boxes that cost no more than $4K, and sometimes (depending e.g. on whether they have any disk at all) as little as $2K. These are “true” commodity boxes; they don’t even use RAID.
  • Ability to scale out to lots of nodes. Few of the new low-cost MPP DBMS vendors have production systems even today of >100 nodes. (Actually, I’m not certain that any except Netezza do, although Kognitio in a prior release of its technology once built a 900ish node production system.)
  • Inherently better performance. Correctly or otherwise, the Facebook guys thought that Hadoop had performance advantages over DBMS, due to the lack of overhead associated with transactions and so on.

One option Facebook didn’t seriously consider was sticking with the incumbent, which Facebook folks regarded as “horrible” and a “lost cause.” The daily pipeline took more than 24 hours to process. Although aware that its big-DBMS-vendor warehouse could probably be tuned much better, Facebook didn’t see that as a path to growing its warehouse more than 100-fold.  (But based on my discussion with Cloudera, I gather that vendor’s DBMS is indeed used to run some reporting today.)

Reliability of Facebook’s Hadoop/Hive system seems to be so-so. It’s designed for a few nodes at a time to fail; that’s no biggie. There’s a head node that’s a single-point of failure; while there’s a backup node, I gather failover takes 15 minutes or so, a figure the Facebook guys think they could reduce substantially if they put their minds to it. But users submitting long-running queries don’t seem to mind delays of up to an hour, as long as they don’t have to resubmit their queries. Keeping ETL up is a higher priority than keeping query execution going. Data loss would indeed be intolerable, but at that level Hadoop/Hive seems to be quite trustworthy.

There also are occasional longer partial(?) outages, when an upgrade introduces a bug or something, but those don’t seem to be a major concern.

Facebook’s variability in node hardware raises an obvious question — how does Hadoop deal with heterogeneous hardware among its nodes? Apparently a fair scheduling capability has been built for Hadoop, with Facebook as the first major user and Yahoo apparently moving in that direction as well. As for inputs to the scheduler (or any more primitive workload allocator) — well, that depends on the kind of heterogeneity.

  • Disk heterogeneity — a distributed file system reports back about disk.
  • CPU heterogeneity — different nodes can be configured to run different numbers of concurrent tasks each.
  • RAM heterogeneity — Hadoop does not understand the memory requirements of each task, and does not do a good job of matching tasks to boxes accordingly. But the Hadoop community is working to fix this.

Further notes on Hive

Without Hive, some basic Hadoop data manipulations can be a pain in the butt. A GROUP BY or the equivalent could take >100 lines of Java or Python code, and unless the person writing it knew something about database technology, it could use some pretty sub-optimal algorithms even then. Enter Hive.

Hive sets out to fix this problem. Originally developed at Facebook (in Java, like Hadoop is), Hive was open-sourced last summer, by which time its SQL interface was in place, and now has 6 main developers. The essence of Hive seems to be:

  • An interface that implements a subset of SQL
  • Compilation of that SQL into a MapReduce configuration file.
  • An execution engine to run same.
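To make the GROUP BY point concrete, here is a rough in-memory sketch of what a hand-written grouped count looks like when it is expressed as map, shuffle and reduce steps. It is not Hadoop or Hive code; the function names and the sample log records are made up for illustration. In SQL the whole thing would be a single `SELECT page, COUNT(*) FROM log GROUP BY page`.

```python
from collections import defaultdict


def map_phase(records, key_field):
    """Emit one (key, 1) pair per record, as a mapper would."""
    for record in records:
        yield record[key_field], 1


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values of each group, as a reducer would."""
    return {key: sum(values) for key, values in groups.items()}


def group_by_count(records, key_field):
    return reduce_phase(shuffle(map_phase(records, key_field)))


# Hypothetical log records, for illustration only.
log = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/home"},
    {"user": "alice", "page": "/ads"},
]
print(group_by_count(log, "page"))
```

Even this toy version needs three functions for one aggregate; a real job adds serialization, partitioning and job configuration on top, which is the boilerplate a SQL layer hides.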

The SQL implemented so far seems, unsurprisingly, to be what is most needed to analyze Facebook’s log files. I.e., it’s some basic stuff, plus some timestamp functionality. There also is an extensibility framework, and some ELT functionality.

Known users of Hive include Facebook (definitely in production) and hi5 (apparently in production as well). Also, there’s a Hive code committer from Last.fm.

posted @ 2009-05-12 19:43 new 维生素C.net() Views (107) | Comments (0)

May 9, 2009

Yes, you can do that. Compile your views for any release build you are trying to do. This will make sure everything compiles nicely and your users don’t see an “Error 500” when accessing a view. Of course, errors can still happen, but at least, it will not be the view’s fault anymore.

Here’s how you compile your views:

1. Open the project file in a text editor. For example, start Notepad and open the project file for your ASP.NET MVC application (that is, MyMvcApplication.csproj).

2. Find the top-most <PropertyGroup> element and add a new element <MvcBuildViews>:

<PropertyGroup>
  ...
  <MvcBuildViews>true</MvcBuildViews>
</PropertyGroup>

3. Scroll down to the end of the file and uncomment the <Target Name="AfterBuild"> element. Update its contents to match the following:

<Target Name="AfterBuild" Condition="'$(MvcBuildViews)'=='true'">
  <AspNetCompiler VirtualPath="temp"
                  PhysicalPath="$(ProjectDir)\..\$(ProjectName)" />
</Target>

4. Save the file and reload the project in Visual Studio.

Enabling view compilation may add some extra time to the build process. It is recommended not to enable this during development as a lot of compilation is typically involved during the development process.

posted @ 2009-05-09 17:07 new 维生素C.net() Views (27) | Comments (1)

SELECT OBJECT_NAME(OBJECT_ID) AS TableName, last_user_update, *
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID('AdventureWorks')
AND OBJECT_ID = OBJECT_ID('test')

posted @ 2009-05-09 16:53 new 维生素C.net() Views (29) | Comments (0)